Abstract

We consider stationary ergodic processes indexed by $\mathbb{Z}$ or $\mathbb{Z}^n$ whose finite dimensional
marginals have laws which are absolutely continuous with respect to Lebesgue measure.
We define an entropy theory for these continuous processes, prove an analog of the
Shannon-McMillan-Breiman theorem and study more precisely the particular example of Gaussian
processes.

1 Introduction 

In [17] Shannon introduced a general theory of entropy designed to quantify the rate at which 
information is produced through the evolution of a stationary stochastic process. He gave two 
definitions: one for random variables assuming discrete values and the other for random variables 
which assume real values or, more generally, values in $\mathbb{R}^n$. However these definitions represent but
two aspects of a single notion. 

In 1958 Kolmogorov adapted Shannon's discrete version in his definition of entropy for dynami- 
cal systems which enabled him to solve an important open problem concerning the classification 
of measure-preserving transformations with continuous (in fact Lebesgue) spectrum. Shortly af- 
terwards Sinai modified and improved Kolmogorov's definition so that today one speaks of the 
Kolmogorov-Sinai, K.S., invariant. 



The K.S. entropy is also an invariant for stationary stochastic processes in as much as they may 
be represented as measure preserving transformations. However these processes frequently have 
infinite entropy particularly when the random variables take their values in a non-discrete space. 
For example all stationary Gaussian processes with absolutely continuous spectrum have the same 
infinite K.S. entropy and indeed they are all measure theoretically isomorphic to each other (Orn- 
stein [9]). 

For this reason we feel that a modification of the traditional definition of isomorphism should be 
considered which distinguishes between various processes with infinite K.S. entropy. This paper 
should be regarded as a move toward this end in that we produce an "invariant" based on Shannon's
second version of entropy (designed for continuous valued random variables).
Shannon's entropy was extensively investigated, especially by the Russian school, in the late 1950's 
and a thorough account appears in Pinsker's book [14]. One of our main purposes is to clarify and 
extend known results in this area. 

The invariant investigated here (naturally referred to as Shannon entropy) is simply a normalised
limit of entropy for $\mathbb{R}^n$ valued random variables, which we compute for all stationary Gaussian
processes in terms of their spectral measures. In general this entropy is finite even when the K.S. 
entropy is infinite. We show how this entropy changes when a stationary Gaussian process is sub- 
jected to a linear transformation. 

This is a topic closely related to the work of Wiener [20] and Kolmogorov [7] on linear prediction 
theory. It is well known that stationary Gaussian processes may be regarded as non-linear extensions
of stationary linear sequences in Hilbert space (i.e. "wide sense" processes in the language
of Doob [2]) and entropy theory may be regarded as the prediction theory of non-linear processes.
Wiener was particularly interested in non-linear prediction and in [21] he attempted to prove that
all stationary processes satisfying certain mild conditions are isomorphic to independent Gaussian
processes. As shown by Rosenblatt [15] his proof was flawed and it is interesting to note that his
mistake, which concerned the behaviour of decreasing sequences of sigma algebras, was repeated by
Kolmogorov (as shown by Rokhlin) in a different context some years later. We now know through
the work of Ornstein and his co-workers that Wiener's claim is actually false. 
We shall be concerned almost exclusively with stationary stochastic processes $X = (X_n)$ for which
the distribution measure of $(X_0, ..., X_n)$ is absolutely continuous with respect to $n+1$ dimensional
Lebesgue measure for all $n = 0, 1, ...$ Our first theorem shows that for any ergodic
measure-preserving transformation $T$ of a probability space there exist functions $F$ such that $(F \circ T^n)$ satisfies
the above condition. We prove thereafter that an entropy "à la Kolmogorov" can be defined using
the continuous entropy definition of Shannon. We prove that these averages of entropies always
converge. The theory extends naturally to $\mathbb{Z}^n$ actions. In this framework, we prove a
Shannon-McMillan-Breiman type pointwise theorem. However, if the Kolmogorov-Sinai entropy of the
transformation is finite, then the limit above is always $-\infty$. In general, this limit is majorized by
$\frac{1}{2}\log(2\pi)$ plus one half of the variance of the observable. In particular, the preceding inequality
reduces to equality if and only if the process is Gaussian independent (this generalizes to processes
a result of Shannon for the case of random variables). We also give a similar characterization for a
stationary process to be Markovian, and, more generally, to have memory $p$. The theory applies
naturally to Gaussian processes, for which we give a closed formula for this continuous entropy. It
turns out that, in the Gaussian Markovian case, this entropy determines the process, and in the
Gaussian case, when finite, it determines the process up to unilateral isomorphism.
We also give some relationships with the rate of entropy of Pinsker and with the rate of generation
of information.

William Parry, our co-author, died on August 20th, 2006, at the time when these notes were being
completed.



2 Absolutely continuous processes based on an ergodic system 



We begin with 
Definition 2.1: 

A real valued discrete time process $X = (X_n)_{n \in I}$, indexed by a countable set $I$, is said to be
absolutely continuous if for every finite subset $K$ of $I$, with cardinality $|K|$, the joint distribution of
$(X_n)_{n \in K}$ is absolutely continuous with respect to the $|K|$-dimensional Lebesgue measure.
We shall only be using $I = \mathbb{Z}$, or $I = \mathbb{Z}^2$.
In this section we prove the following: 
Proposition 2.2: 






Let $(\Omega, T, \mu)$ be an invertible ergodic dynamical system. Then there exists $F \in L^2(\mu)$ such that the
process $(F \circ T^n)$ is absolutely continuous.

Proof: Let $(Y, S, \gamma)$ be the independent Gaussian dynamical system: $Y := \mathbb{R}^{\mathbb{Z}}$, $S$ the shift
transformation: $(Sy)_j = y_{j+1}$, for $y \in Y$, $j \in \mathbb{Z}$, and $\gamma$ the product measure of the measures $\gamma_j$, $j \in \mathbb{Z}$, where,
for every $j$, $\gamma_j = (2\pi)^{-1/2}\exp(-\frac{1}{2}x^2)\,dl(x)$, $l$ being the Lebesgue measure on $\mathbb{R}$. Let $Y_n : Y \to \mathbb{R}$ be
the projection onto the $n$-th coordinate.

According to Dye's Theorem [3], there exists an integer-valued measurable function $r : Y \to \mathbb{Z}$
such that if $S_1 : Y \to Y$ is the transformation defined by $S_1(y) = S^{r(y)}(y)$, for $\gamma$-almost all $y \in Y$,
then the two dynamical systems $(\Omega, T, \mu)$ and $(Y, S_1, \gamma)$ are isomorphic. Let $\theta : \Omega \to Y$ be a map
giving the isomorphism: $\theta \circ T = S_1 \circ \theta$ and $\gamma = \mu \circ \theta^{-1}$. Set $F = Y_0 \circ \theta$ and $Z_n = Y_0 \circ S_1^n$, for $n \in \mathbb{Z}$,
so that $(F, F \circ T, ..., F \circ T^{n-1})$ and $(Z_0, Z_1, ..., Z_{n-1})$ have the same law, say $\alpha_n$. We show now that
$\alpha_n$ is absolutely continuous with respect to Lebesgue measure and this ends the proof. To do this,
for every $k = (k_1, ..., k_{n-1}) \in \mathbb{Z}^{n-1}$, put

$E_k = \{r = k_1,\ r \circ S^{k_1} = k_2,\ r \circ S^{k_1+k_2} = k_3,\ ...,\ r \circ S^{k_1+...+k_{n-2}} = k_{n-1}\}$,

and let $\mathcal{F}_k$ denote the sigma-algebra generated by $Y_0, Y_{k_1}, ..., Y_{k_1+...+k_{n-1}}$. In view of the definition
of $(Y, S, \gamma)$, straightforward computations show then that $\alpha_n$ is absolutely continuous with respect
to Lebesgue measure and has the density $g := \sum_{k \in \mathbb{Z}^{n-1}} g_k$, where

$g_k(y_0, ..., y_{n-1}) = E_\gamma[1_{E_k} \mid \mathcal{F}_k](y_0, y_1, ..., y_{n-1}) \times (2\pi)^{-n/2}\exp(-\frac{1}{2}(y_0^2 + ... + y_{n-1}^2))$,

and $E_\gamma[1_{E_k} \mid \mathcal{F}_k]$ denotes the conditional expectation with respect to $\gamma$ of $1_{E_k}$ given the sigma
algebra $\mathcal{F}_k$. □
Remark 2.3: 

(1) The function $F$ in Proposition 2.2 can be taken (as the proof shows) in the intersection of $\{L^p(\mu) : p \ge 1\}$.

(2) We can show that there is $F$ such that, for every $n$, the law of $(F, F \circ T, ..., F \circ T^{n-1})$ is equivalent
to the $n$-dimensional Lebesgue measure.

In the same way we have 






Proposition 2.4: 

If $T$ and $S$ are measure preserving transformations which commute on a probability space $(\Omega, \mathcal{F}, \mu)$,
such that the joint action is ergodic, then there exists $F \in L^2(\mu)$ such that the process
$(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2}$ is absolutely continuous.



3 Notations, a few prerequisites and a lemma 



3.1 Conditional entropy of probability measures 



In this subsection we recall various definitions attached to Shannon entropy [14].
Definition 3.1:

Let $P$ and $Q$ be probability measures defined on the same measurable space $(\Omega, \mathcal{F})$. Let $\mathcal{P}$ be the set
of all finite measurable partitions of $\Omega$. If $\Pi$ is in $\mathcal{P}$ let

(1) $S_\Pi(P \mid Q) := \sum_{B \in \Pi} P(B)\log\frac{P(B)}{Q(B)}$.

Then for $\Pi_1 \in \mathcal{P}$, $\Pi_1$ finer than $\Pi$ implies

(2) $S_\Pi(P \mid Q) \le S_{\Pi_1}(P \mid Q)$.

The entropy $H_Q(P)$ of $P$ with respect to $Q$ is defined by

(3) $H_Q(P) := \sup_{\Pi \in \mathcal{P}} S_\Pi(P \mid Q)$.


We list without proofs some results which we are going to use. 



Theorem A (Gelfand, Yaglom, Perez, [14]): Let $P, Q$ be two probability measures on the
measurable space $(\Omega, \mathcal{F})$. If the entropy $H_Q(P)$ is finite then $P$ is absolutely continuous
with respect to $Q$ and

$H_Q(P) = \int \log\frac{dP}{dQ}\,dP$.

(In particular if $P$ is not absolutely continuous with respect to $Q$, $H_Q(P) = +\infty$.)
This was introduced first by Shannon [17], for densities:

If $f \in L^1_+(dx)$ is such that $\int_{\mathbb{R}} f(x)\,dx = 1$, Shannon considered $\int_{\mathbb{R}} f(x)\log f(x)\,dx$.



Let $\psi$ be the function defined for $x > 0$, by

(4) $\psi(x) = -x\log(x)$.

Remark 3.2:

(1) If $H_Q(P)$ is finite then

(5) $H_Q(P) = \int \log(\frac{dP}{dQ}) \times \frac{dP}{dQ}\,dQ = -\int \psi(\frac{dP}{dQ})\,dQ$.

In particular,

(2) $H_Q(P) = 0$ if and only if $P = Q$.

(3) More generally, if $P$ is absolutely continuous with respect to $Q$ then

(6) $H_Q(P) < \infty \iff \int \log(\frac{dP}{dQ})\,dP < \infty \iff \int (-\psi)(\frac{dP}{dQ})\,dQ < \infty$.
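
As a numerical illustration of (1)-(3) and of Theorem A, the following Python sketch evaluates $S_\Pi(P \mid Q)$ on nested interval partitions of $\mathbb{R}$. The laws $P = N(1,1)$ and $Q = N(0,1)$, the cut points and all function names are assumptions chosen for the example; for this pair the integral $\int \log\frac{dP}{dQ}\,dP$ equals $1/2$, so the partition sums should increase towards $1/2$ under refinement.

```python
import math

def normal_cdf(x, mean):
    """Distribution function of the normal law with the given mean and unit variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))

def partition_sum(cuts, mean_p=1.0, mean_q=0.0):
    """S_Pi(P | Q) for the partition of R into the intervals delimited by `cuts`."""
    edges = [-math.inf] + sorted(cuts) + [math.inf]
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        p = normal_cdf(b, mean_p) - normal_cdf(a, mean_p)
        q = normal_cdf(b, mean_q) - normal_cdf(a, mean_q)
        if p > 0.0 and q > 0.0:  # skip numerically empty cells
            total += p * math.log(p / q)
    return total

def dyadic_cuts(level, lo=-6.0, hi=6.0):
    """Cut points of mesh 2**-level on [lo, hi]; each level refines the previous one."""
    step = 2.0 ** (-level)
    count = int(round((hi - lo) / step))
    return [lo + i * step for i in range(count + 1)]

# Partition sums along a refining sequence of partitions.
sums = [partition_sum(dyadic_cuts(level)) for level in range(5)]
for level, value in enumerate(sums):
    print(level, value)
```

The printed values illustrate the monotonicity property (2): refining the partition can only increase the sum, and the supremum in (3) is approached from below.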



The following theorem follows from the monotonicity property (2). 

Theorem B (Dobrushin, [14]): Let $P, Q$ be probability measures on $(\Omega, \mathcal{F})$, $\mathcal{C}$ an algebra of sets
belonging to $\mathcal{F}$, which generates the sigma-algebra $\mathcal{F}$, and let $\mathcal{R}$ be a family of finite partitions of
$\Omega$ whose elements belong to $\mathcal{C}$. If every partition consisting of sets from $\mathcal{C}$ has a finer partition in
$\mathcal{R}$, then

$H_Q(P) = \sup_{\Pi \in \mathcal{R}} S_\Pi(P \mid Q)$.

Note that, as remarked by the translator of Pinsker, in Theorem B of Dobrushin, the condition
that the elements of the partitions in $\mathcal{R}$ be in $\mathcal{C}$ is not necessary.



3.2 A lemma 

The following lemma will play an essential role in the rest of the paper. 
Lemma 3.3: 

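Let $(\Omega_i, \mathcal{F}_i, P_i)$ be a probability space for $i = 1, 2$, and $\nu$ a probability measure on $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \otimes \mathcal{F}_2)$
with marginals $\nu_1$ on $(\Omega_1, \mathcal{F}_1)$ and $\nu_2$ on $(\Omega_2, \mathcal{F}_2)$. Then

(7) $H_{P_1 \times P_2}(\nu) = H_{\nu_1 \times \nu_2}(\nu) + H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$,

and in particular

(8) $H_{P_1 \times P_2}(\nu) \ge H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$,

(9) $H_{P_1 \times P_2}(\nu) \ge H_{\nu_1 \times \nu_2}(\nu)$.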

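Proof: By Theorem B of Dobrushin, $H_{P_1 \times P_2}(\nu)$ is given by the supremum, over all finite measurable
partitions $\Pi_1$ of $\Omega_1$ and $\Pi_2$ of $\Omega_2$, of the sums $S_{\Pi_1 \times \Pi_2}(\nu \mid P_1 \times P_2)$.

If $\nu$ is not absolutely continuous with respect to $\nu_1 \times \nu_2$, then, by Theorem A, $H_{\nu_1 \times \nu_2}(\nu) = +\infty$,
and thus

$H_{P_1 \times P_2}(\nu) \le +\infty = H_{\nu_1 \times \nu_2}(\nu) + H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$.

If $\nu$ is absolutely continuous with respect to $\nu_1 \times \nu_2$, the equalities

$\log\frac{\nu(A \times B)}{P_1(A)P_2(B)} = \log\frac{\nu(A \times B)}{\nu_1(A)\nu_2(B)} + \log\frac{\nu_1(A)}{P_1(A)} + \log\frac{\nu_2(B)}{P_2(B)}$, for $A \in \Pi_1$, $B \in \Pi_2$,

imply

$S_{\Pi_1 \times \Pi_2}(\nu \mid P_1 \times P_2) = S_{\Pi_1 \times \Pi_2}(\nu \mid \nu_1 \times \nu_2) + S_{\Pi_1}(\nu_1 \mid P_1) + S_{\Pi_2}(\nu_2 \mid P_2)$, $(E_1)$

and therefore

$S_{\Pi_1 \times \Pi_2}(\nu \mid P_1 \times P_2) \le H_{\nu_1 \times \nu_2}(\nu) + H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$,

from which it follows that

$H_{P_1 \times P_2}(\nu) \le H_{\nu_1 \times \nu_2}(\nu) + H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$.

To prove the reverse inequality

$H_{P_1 \times P_2}(\nu) \ge H_{\nu_1 \times \nu_2}(\nu) + H_{P_1}(\nu_1) + H_{P_2}(\nu_2)$, $(E_2)$

consider four arbitrary finite measurable partitions: $\Pi_1, \Delta_1$ of $\Omega_1$, and $\Pi_2, \Delta_2$ of $\Omega_2$, and note
that we can find a finite measurable partition $\Gamma_i$ of $\Omega_i$ refining both $\Pi_i$ and $\Delta_i$, $i = 1, 2$. Then the
partition $\Gamma_1 \times \Gamma_2 := \{M \times N : M \in \Gamma_1, N \in \Gamma_2\}$ refines also $\Pi_1 \times \Pi_2$. But, in view of inequality
(2), we have, for $i = 1, 2$,

$S_{\Delta_i}(\nu_i \mid P_i) \le S_{\Gamma_i}(\nu_i \mid P_i)$.

Similarly

$S_{\Pi_1 \times \Pi_2}(\nu \mid \nu_1 \times \nu_2) \le S_{\Gamma_1 \times \Gamma_2}(\nu \mid \nu_1 \times \nu_2)$.

So by summing we get, by $(E_1)$,

$S_{\Delta_1}(\nu_1 \mid P_1) + S_{\Delta_2}(\nu_2 \mid P_2) + S_{\Pi_1 \times \Pi_2}(\nu \mid \nu_1 \times \nu_2) \le S_{\Gamma_1 \times \Gamma_2}(\nu \mid P_1 \times P_2) \le H_{P_1 \times P_2}(\nu)$,

from which $(E_2)$ follows. This proves (7). As trivially (7) implies (8) and (9), the proof is finished. □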
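
On a finite product space the suprema in (3) are attained by the partition into points, so identity (7) can be checked directly. The Python sketch below does this for one hypothetical joint law $\nu$ on a $2 \times 3$ space; the numerical values of $\nu$, $P_1$, $P_2$ and the helper names are assumptions made for the illustration only.

```python
import math

def relative_entropy(p, q):
    """H_q(p) = sum of p log(p/q) for two probability vectors on the same finite set."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def product(a, b):
    """Product measure of a and b, flattened in row-major order."""
    return [x * y for x in a for y in b]

# Hypothetical joint law nu (rows: first coordinate, columns: second coordinate).
nu = [[0.10, 0.25, 0.05],
      [0.30, 0.05, 0.25]]
p1 = [0.5, 0.5]        # reference law P_1 on the first factor
p2 = [0.2, 0.3, 0.5]   # reference law P_2 on the second factor

nu1 = [sum(row) for row in nu]                       # first marginal of nu
nu2 = [sum(row[j] for row in nu) for j in range(3)]  # second marginal of nu
flat_nu = [x for row in nu for x in row]

lhs = relative_entropy(flat_nu, product(p1, p2))
rhs = (relative_entropy(flat_nu, product(nu1, nu2))
       + relative_entropy(nu1, p1)
       + relative_entropy(nu2, p2))
print(lhs, rhs)
```

Both printed numbers agree up to rounding, and each of the three terms on the right-hand side is non-negative, which is the content of (8) and (9).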
Remark 3.4:

The inequality (9) implies that, for fixed $\nu$, the infimum, over all probability measures $P_1$ and $P_2$,
of $H_{P_1 \times P_2}(\nu)$ is attained for $P_1 = \nu_1$ and $P_2 = \nu_2$, and the formula (7) shows that it is attained
only for these particular values of $P_1$ and $P_2$.

Corollary 3.5:

Let $\nu$ be a probability measure on a product measurable space. If the entropy $H_{P_1 \times P_2}(\nu)$ of $\nu$, with
respect to a product probability measure $P_1 \times P_2$, is finite, then $\nu$ is absolutely continuous with
respect to the product $\nu_1 \times \nu_2$ of its marginals, and these marginals are absolutely continuous with
respect to $P_1$ and $P_2$ respectively.



4 Shannon entropy of absolutely continuous processes 
4.1 Notation 

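Let $(\Omega, T, \mu)$ be an invertible ergodic dynamical system. Let $\mathcal{A} = \{A_0, ..., A_k\}$ be a finite partition
of $\Omega$. Let $m$ on $\mathcal{A}^{\mathbb{Z}}$ be the product measure with marginals giving equal weights to the atoms of
$\mathcal{A}$. Let $F = \sum_{j=0}^{k} a_j 1_{A_j}$ be discrete with $a_i \ne a_j$ for $i \ne j$, and $F_n(x) = (F(x), ..., F(T^{n-1}x))$, for
$x \in \Omega$. Then if $\mu_n = \mu F_n^{-1}$ and $m_n$ respectively are the restrictions of $\mu$ and $m$ to $\bigvee_{j=0}^{n-1} T^{-j}\mathcal{A}$, we
obtain, with the standard definition of entropy of a partition,

$H_\mu(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{A}) = n\log(k+1) - H_{m_n}(\mu_n)$.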



In this paper we are interested in the case where $F$ is continuous valued, say real valued, with
$(F \circ T^n)$ absolutely continuous (cf. Definition 2.1). In this case, denoting by $l^n$ the Lebesgue measure
on $\mathbb{R}^n$, we consider the function $I_n = I_n(F)$ defined by:

(10) $I_n(x) := -\log\frac{d\mu F_n^{-1}}{dl^n} \circ F_n(x)$, $x \in \Omega$,

together with its integral

(11) $H_n = H_n(F) := \int I_n\,d\mu$.

Then, with $\psi$ as in (4), it follows

(12) $H_n(F) = \int \psi(\frac{d\mu F_n^{-1}}{dl^n})\,dl^n$.

We focus on the asymptotic behavior of the sequences $\frac{1}{n}I_n$ and $\frac{1}{n}H_n$, which we call, respectively,
the sequences of Shannon information and Shannon entropy associated to the process $(F \circ T^n)$.
Let

(13) $\gamma_0 = \frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}x^2)\,dl(x)$, $\gamma_n = \gamma_0^{\otimes n}$, $n \ge 1$.



The following quantities are closely related to $I_n$ and $H_n$:

(14) $I_{n,G} = I_{n,G}(F) = -\log\frac{d\mu F_n^{-1}}{d\gamma_n} \circ F_n$ ($H_{n,G} = H_{n,G}(F) = \int I_{n,G}(F)\,d\mu$),

(15) $I_{n,PM} = I_{n,PM}(F) = -\log\frac{d\mu F_n^{-1}}{d(\mu F^{-1})^{\otimes n}} \circ F_n$ ($H_{n,PM}(F) = \int I_{n,PM}(F)\,d\mu$).

The link between these quantities, which behave very much the same, is made precise in the formulas
(46) and (47).

We shall employ each of the above quantities as seems appropriate. It should be clear that a result 
formulated using one is easily transformed into a result formulated in terms of the other. 
In the case of an ergodic $\mathbb{Z}^2$ action we shall use the following notation.

If $F : \Omega \to \mathbb{R}$ is measurable and $K$ is a finite subset of $\mathbb{Z}^2$, let $F_K(x) = (F(T^m S^n x))_{(m,n) \in K}$,
for $x \in \Omega$. If the law of $F_K$ is absolutely continuous with respect to $|K|$-dimensional Lebesgue
measure, we denote by $f_K$ its density. In particular if $K = \{(i,j) \in \mathbb{Z}^2 : 0 \le i \le n-1,\ 0 \le j \le n-1\}$,
we denote $F_K$ by $F_{n^2}$ and $f_K$ by $f_{n^2}$. If the process $(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2}$ is absolutely continuous,
as in the one dimensional case, we consider

(16) $I_{n^2} := -\log f_{n^2} \circ F_{n^2}$,

and its integral

(17) $H_{n^2} := \int I_{n^2}\,d\mu$.

As our concern is the asymptotic behavior of $\frac{1}{n^2}I_{n^2}$, there is no loss of generality if we suppose
that $\Omega = \mathbb{R}^{\mathbb{Z}^2}$, $F$ is the projection onto the zero coordinate, $(Tx)_g = x_{g+(0,1)}$, $(Sx)_g = x_{g+(1,0)}$,
for $x \in \mathbb{R}^{\mathbb{Z}^2}$, $g \in \mathbb{Z}^2$, and that $\mu$ is a probability measure whose finite dimensional marginals are
absolutely continuous with respect to Lebesgue measure, and which is invariant by the shifts $T$ and
$S$. In this case all the densities $f_{n^2}$, for various $n$, will be denoted by $f$ without subscript, so that
for every $x \in \mathbb{R}^{\mathbb{Z}^2}$, we have

(18) $\frac{1}{n^2}I_{n^2}(x) = -\frac{1}{n^2}\log f(X^n_{n-1,n-1})$,

where

$X^n_{i,j} := (x_{s,t})_{(s,t) \in I^n_{i,j}}$ $(\alpha_0)$

and

$I^n_{i,j} := \{(s,j) : s = 0, ..., i\} \cup \{(s,t) : 0 \le s \le n-1,\ 0 \le t \le j-1\}$, $(\alpha_1)$

for $1 \le i, j \le n-1$, $I^n_{0,0} = \{(0,0)\}$ and $I^n_{i,0} = \{(s,0) : s = 0, ..., i\}$. Let, for future use,

$Y^n_{i,j} := (x_{s,t})_{(s,t) \in I^n_{i,j},\ (s,t) \ne (0,0)}$. $(\alpha_2)$

In particular

$X^n_{n-1,n-1} := (x_{s,t})_{s,t = 0, ..., n-1}$, $(\sigma_0)$

and

$Y^n_{n-1,n-1} := (x_{s,t})_{s,t = 0, ..., n-1,\ (s,t) \ne (0,0)}$. $(\sigma_1)$

Let

$L := \{(i,j,n) : 0 \le i, j \le n-1,\ n \ge 1\}$. $(\sigma_2)$

The set inclusion on the $I^n_{i,j}$ induces a partial order on $L$: we set

$(i,j,n) \le (i',j',n') \iff I^n_{i,j} \subset I^{n'}_{i',j'} \iff (j < j',\ n \le n')$ or $(j = j',\ i \le i',\ n \le n')$.

That is the product of the lexicographical order on $\{(j,i)\}$ and the usual order on $\{n\}$.
In the remark below, we single out two properties which we use later: 
Remark 4.1: 

(1) $L$ is directed, and $C := \{(n-1, n-1, n) : n \ge 1\}$ is a cofinal subset of $L$.

(2) If $\{(i_k, j_k, n_k) : k \in \mathbb{N}\}$ is an infinite subset of $L$, then there is an infinite subset $J$ of $\mathbb{N}$, such
that the sequence $((i_k, j_k, n_k))_{k \in J}$ is strictly increasing in $L$, and $\lim_{k \in J} n_k = +\infty$.

For $l = (i,j,n) \in L$, denote by $\mathcal{F}_l$ or $\sigma(X^n_{i,j})$ the sigma-algebra generated by $\{x_{s,t} : (s,t) \in I^n_{i,j}\}$.
Then, for $l, l' \in L$, $l \le l'$ implies $\mathcal{F}_l \subset \mathcal{F}_{l'}$.

If $m$ is a probability measure on $\mathbb{R}^{\mathbb{Z}^2}$, $m^n_{i,j}$ will denote its restriction to $\mathcal{F}_l = \sigma(X^n_{i,j})$, and then we
write

$m^n_{i,j} := m \mid \sigma(X^n_{i,j})$, $(\tau_0)$

and in particular $m^n_{n-1,n-1}$ will be denoted simply $m_n$.

Two particular probability measures $\pi$ and $\nu$ on $\mathbb{R}^{\mathbb{Z}^2}$ will be useful for our purpose. Their finite
dimensional marginals $\pi_n = \pi^n_{n-1,n-1}$ and $\nu_n = \nu^n_{n-1,n-1}$ are given by

(19) $\pi_n = \prod_{(s,t) \in I^n_{n-1,n-1}} f(x_{s,t})\,d\lambda$,

and

(20) $\nu_n = f(x_{0,0}) \times f(Y^n_{n-1,n-1})\,d\lambda$,

where $Y^n_{n-1,n-1}$ is as in $(\sigma_1)$ and $\lambda = \lambda_{n^2}$ denotes $n^2$-dimensional Lebesgue measure.

Recall also that

(21) $\mu_n = f(X^n_{n-1,n-1})\,d\lambda$,

where $X^n_{n-1,n-1}$ is as in $(\sigma_0)$.

4.2 Convergence of the Shannon entropy 

We now turn to the dynamical situation. We consider a dynamical system $(\Omega, T, \mu)$ and denote by
$AC(\Omega, T, \mu)$ the set of functions $F \in L^2(\mu)$ such that the process $(F \circ T^n)$ is absolutely continuous
(i.e. as in Definition 2.1).

We establish the convergence of the sequence of Shannon entropies defined by (11), and give,
as in the discrete valued case, an a priori upper bound for this limit and a criterion implying that
the process $(F \circ T^n)$ is Gaussian independent (Corollary 4.5). We also identify this limit (Lemma 4.8).

For a $\mathbb{Z}^2$ action, with $T$ and $S$ as generators, $AC(\Omega, T, S, \mu)$ will denote the set of
$F \in L^2(\mu)$ such that the process $(F \circ T^m S^n)_{m,n \in \mathbb{Z}}$ is absolutely continuous (i.e. as in Definition 2.1).

The next formula gives one of the announced links.

Recall that $\gamma_n$ denotes the independent Gaussian measure (cf. (13)), and $\psi$ is as in (4).
Lemma 4.2:

Let $F \in AC(\Omega, T, \mu)$. Then

(22) $\frac{H_n(F)}{n} = \frac{1}{n}\int_{\mathbb{R}^n} \psi(\frac{d\mu F_n^{-1}}{d\gamma_n})\,d\gamma_n + [\frac{1}{2}\log(2\pi) + \frac{1}{2}\|F\|_2^2]$.

Proof: From formula (12), we get

$H_n(F) = \int \psi(\frac{d\mu F_n^{-1}}{d\gamma_n})\,d\gamma_n - \int \log(\frac{d\gamma_n}{dl^n})\,d\mu F_n^{-1}$.

But, if $h(t) := \frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}t^2)$ for $t \in \mathbb{R}$, then

$-\int \log(\frac{d\gamma_n}{dl^n})\,d\mu F_n^{-1} = -\sum_{j=0}^{n-1}\int \log(h(x_j))\,d\mu F_n^{-1}(x)$
$= -n\int \log(h(F(x)))\,d\mu(x) = -n[-\frac{1}{2}\log(2\pi) - \frac{1}{2}\int F(x)^2\,d\mu(x)]$. □

Remark 4.3:

(i) The preceding formula (22) can be written as

(23) $H_n(F) = -H_{\gamma_n}(\mu F_n^{-1}) + \frac{n}{2}(\log 2\pi + \|F\|_2^2)$.

(ii) If $H_n(F)$ is infinite then $H_{n+1}(F)$ is infinite.

In fact, (ii) follows from (i), and (i) from formula (5).
Lemma 4.4:

Let $F \in AC(\Omega, T, \mu)$. Then $(H_n(F))_{n \in \mathbb{N}}$ is a sub-additive sequence: for $n, p \in \mathbb{N}$

(24) $H_{n+p} \le H_n + H_p$.

Proof: Inequality (24) follows immediately from Lemma 3.3 and formula (23).

The following corollary is the analogue for the Shannon entropy of the fact that the Kolmogorov
entropy of a countable state process is bounded by the entropy of the zeroth coordinate. The
equality case is analogous to the fact that in the Kolmogorov situation, the equality implies that
the process is Bernoulli.
Corollary 4.5: 

Let $(\Omega, T, \mu)$ be a dynamical system. Then for any $F \in AC(\Omega, T, \mu)$ the sequence $(\frac{H_n(F)}{n})$ of Shannon
entropies converges to $Se(F,T)$, which may be infinite. Moreover:

(25) $Se(F,T) \le \frac{1}{2}(\log(2\pi) + \|F\|_2^2)$,

and the equality holds if and only if for every $n$ the law $\mu F_n^{-1}$ of $(F, F \circ T, ..., F \circ T^{n-1})$ is the
gaussian independent measure $\gamma_n$.

Proof: The sequence $(\frac{H_n}{n})$ converges to its infimum, since $(H_n)$ is sub-additive by Lemma 4.4. On
the other hand, formula (23) implies $\frac{H_n}{n} \le \frac{1}{2}(\log(2\pi) + \|F\|_2^2)$, which then yields the inequality
(25).

To prove the other statement, note that from formula (23) and the equality $Se(F,T) = \inf\{\frac{H_n}{n} : n \in \mathbb{N}\}$,
it follows that the equality in (25) is equivalent to the equalities $H_{\gamma_n}(\mu F_n^{-1}) = 0$, $\forall n$. But
this is equivalent to the equalities $\mu F_n^{-1} = \gamma_n$, $\forall n$, by Remark 3.2 (2). □

Note that the preceding corollary generalizes to processes a theorem of Shannon that among random
variables with fixed variance the maximum of the continuous entropy is achieved by a gaussian
variable.

A similar proof can yield an $n$ dimensional version of this theorem of Shannon.
Proposition 4.6

Let $M_1(n) = \{p \in L^1_+(l^n) : \int_{\mathbb{R}^n} p(x)\,dl^n(x) = 1,\ \int_{\mathbb{R}^n} \|x\|_2^2\,p(x)\,dl^n(x) = n\}$. Then

$\sup_{p \in M_1(n)} \int_{\mathbb{R}^n} \psi(p(x))\,dl^n(x) = \frac{n}{2}(1 + \log(2\pi))$.

Furthermore this supremum is attained for the gaussian independent density

$p(x) = (2\pi)^{-n/2}\exp(-\frac{1}{2}(x_0^2 + ... + x_{n-1}^2))$

for $x = (x_0, ..., x_{n-1}) \in \mathbb{R}^n$, and this is the only one.
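
For $n = 1$, the bound (25) and Proposition 4.6 can be illustrated with closed-form entropies of three centred laws having the same second moment $s^2$. The Python sketch below relies on the classical entropy formulas for the Gaussian, uniform and Laplace densities; these formulas are standard facts rather than results of this section, and the function names and the chosen values of $s^2$ are assumptions made for the example.

```python
import math

def gaussian_entropy(s2):
    """Entropy of the centred Gaussian law with variance s2."""
    return 0.5 * math.log(2.0 * math.pi * math.e * s2)

def uniform_entropy(s2):
    """Entropy of the centred uniform law with variance s2 (interval of length sqrt(12 s2))."""
    return math.log(math.sqrt(12.0 * s2))

def laplace_entropy(s2):
    """Entropy of the centred Laplace law with variance s2 (scale b = sqrt(s2 / 2))."""
    return 1.0 + math.log(2.0 * math.sqrt(s2 / 2.0))

def bound(s2):
    """Right-hand side of (25) for an observable with second moment s2."""
    return 0.5 * (math.log(2.0 * math.pi) + s2)

for s2 in (0.25, 1.0, 4.0):
    print(s2, gaussian_entropy(s2), uniform_entropy(s2), laplace_entropy(s2), bound(s2))
```

In this sketch the bound is reached only by the Gaussian law with $s^2 = 1$, which is the one-dimensional case of the equality statement in Corollary 4.5.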

We come to the main definition of this section:
Definition 4.7:

The Shannon entropy $Se(F,T)$ of the process $(F \circ T^n)$ is defined by the equality

(26) $Se(F,T) := \lim_n \frac{H_n(F)}{n}$.


Next we identify the limit in Definition 4.7, the Shannon entropy $Se(F,T)$ of the process $(F \circ T^n)$,
using conditional entropy, or information (Lemma 4.8 (b) (ii) and (iii)).

Lemma 4.8:

(a) Suppose that $H_{n+p}$ is finite. Then

(27) $H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1}) = H_n + H_p - H_{n+p}$,

and, for fixed $n$, the sequence $H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1})$ is increasing in $p$.

(b) If $H_n$ is finite for all $n$, then

(i) $Se(F,T)$ is finite if and only if for every $n$, or for some $n$, $\sup_p H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1})$ is finite,

(ii)

(28) $\sup_p H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1}) = H_n - n \times Se(F,T)$.

(iii) If $\nu$ is the law of the process $(F, F \circ T, ...)$ and $\nu_n$ is the probability measure on $\mathbb{R}^{\mathbb{N}}$, with
$n+p$-marginal given by $\mu F_n^{-1} \times \mu F_p^{-1}$, for any $p > 0$, then

$H_{\nu_n}(\nu) = H_n - n\,Se(F,T)$.

In particular, if $Se(F,T)$ is finite, $\nu$ is absolutely continuous with respect to $\nu_n$.

Proof: (a) Formula (27) follows from Lemma 3.3 and formula (23). The other property follows
from the definition, since, when $\Pi_1$ and $\Pi_2$ are finite partitions of $\mathbb{R}^n$ and $\mathbb{R}^p$, respectively, we have

$S_{\Pi_1 \times \Pi_2}(\mu F_{n+p}^{-1} \mid \mu F_n^{-1} \times \mu F_p^{-1}) \le S_{\Pi_1 \times (\Pi_2 \times \mathbb{R})}(\mu F_{n+p+1}^{-1} \mid \mu F_n^{-1} \times \mu F_{p+1}^{-1})$.

(b) Put $v_p^n := H_p - H_{n+p}$, and in particular, for $n = 1$,

$u_p := -v_p^1 = H_{p+1} - H_p$.

Then, from (a) above, the sequence $(u_p)$ is decreasing. So, as $\frac{1}{n}\sum_{p=1}^{n} u_p$ converges to $Se(F,T)$,
$(u_n)$ converges also to $Se(F,T) = \inf_p u_p$. This proves (i). (ii) follows from (i) and the equality
$v_p^n = -u_p - u_{p+1} - ... - u_{n+p-1}$. (iii) follows from (ii). □
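
The sub-additivity of $(H_n)$ and the monotonicity of the increments $u_p = H_{p+1} - H_p$ used in the proof above can be observed numerically on Gaussian examples. The Python sketch below assumes the standard formula $H_n = \frac{n}{2}\log(2\pi e) + \frac{1}{2}\log\det\Sigma_n$ for a centred Gaussian vector with covariance matrix $\Sigma_n$ (a classical fact, not derived in this section); the two autocovariance functions, their parameters and all function names are hypothetical choices made for the illustration.

```python
import math

def log_det(matrix):
    """Logarithm of the determinant of a symmetric positive definite matrix (Cholesky)."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total

def gaussian_block_entropies(autocov, n_max):
    """List H with H[n] = H_n for n = 0..n_max (H[0] = 0 is only a placeholder)."""
    entropies = [0.0]
    for n in range(1, n_max + 1):
        sigma = [[autocov(abs(i - j)) for j in range(n)] for i in range(n)]
        entropies.append(0.5 * n * math.log(2.0 * math.pi * math.e) + 0.5 * log_det(sigma))
    return entropies

def ar1(lag, a=0.6):
    """Autocovariance of a unit-variance Gaussian Markov (AR(1)) process."""
    return a ** lag

def ma1(lag, rho=0.4):
    """Autocovariance of a unit-variance moving average of order one (not Markovian)."""
    return 1.0 if lag == 0 else (rho if lag == 1 else 0.0)

N = 12
H_ar = gaussian_block_entropies(ar1, N)
H_ma = gaussian_block_entropies(ma1, N)
u_ar = [H_ar[p + 1] - H_ar[p] for p in range(1, N)]  # increments u_1, ..., u_{N-1}
u_ma = [H_ma[p + 1] - H_ma[p] for p in range(1, N)]
print(u_ar[0], u_ar[-1])
print(u_ma[0], u_ma[-1])
```

In the first example the increments are constant, so the limit of $\frac{H_n}{n}$ already equals $H_2 - H_1$; in the second they decrease strictly, so $H_2 - H_1$ overestimates the limit.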

Remark 4.9: 

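If $Se(F,T)$ is finite then

(29) $\lim_n \frac{1}{n}\sup_p H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1}) = 0$,

and for fixed $p$, $H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1})$ is increasing in $n$, and $z_n := \sup_p H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1})$ is
sub-additive and increasing.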

For the convergence of Shannon entropy in the case of $\mathbb{Z}^2$ action, for $F \in AC(\Omega, T, S, \mu)$, we have
the following
Lemma 4.10:
Let $\mu_n$ and $\pi_n$ be as in (21) and (19) respectively. Then

$\lim_n \frac{1}{n^2}H_{n^2} = -\sup_n \frac{1}{n^2}H_{\pi_n}(\mu_n) - \int_{\mathbb{R}} f(t)\log f(t)\,dt$.

Proof: With notation as in $(\sigma_0)$, we can write $f(X^n_{n-1,n-1})$ in the following form

$f(X^n_{n-1,n-1}) = \frac{f(X^n_{n-1,n-1})}{\prod_{(s,t)} f(x_{s,t})} \times \prod_{(s,t)} f(x_{s,t})$,

from which we get

$\int \log f(X^n_{n-1,n-1})\,d\mu = \int \log\frac{f(X^n_{n-1,n-1})}{\prod_{(s,t)} f(x_{s,t})}\,d\mu + \sum_{(s,t)} \int \log f(x_{s,t})\,f(x_{s,t})\,dx_{s,t}$.

That is,

(30) $\int \log f(X^n_{n-1,n-1})\,d\mu = H_{\pi_n}(\mu^n_{n-1,n-1}) + n^2 \int f(x_{0,0})\log f(x_{0,0})\,dx_{0,0}$. (****)

Put $z_n := H_{\pi_n}(\mu^n_{n-1,n-1})$. (One can see that $(-z_n)$ is sub-additive.)

Let $R^n_{k,l} = \mathbb{Z}^2 \cap [kn-1, (k+1)n-1] \times [ln-1, (l+1)n-1]$ and denote by $\mu \mid R^n_{k,l}$ the restriction
of $\mu$ to the coordinates in $R^n_{k,l}$ and similarly for $\pi \mid R^n_{k,l}$. Let $n$ be fixed and $N > n$ be an integer.
Write $N = p_N n + r_N = pn + r$ where $p = p_N$, $r = r_N \in \mathbb{N}$, $0 \le r < n$ and $p \ge 1$. Then clearly we have,
by the definition of the conditional entropy

$z_N = H_{\pi_N}(\mu^N_{N-1,N-1}) \ge H_{\pi_{pn}}(\mu^{pn}_{pn-1,pn-1}) = z_{pn}$.

But, by Lemma 3.3 and invariance, we obtain

$z_{pn} \ge \sum_{k,l=0}^{p-1} H_{\pi \mid R^n_{k,l}}(\mu \mid R^n_{k,l}) = p^2 H_{\pi \mid R^n_{0,0}}(\mu \mid R^n_{0,0}) = p^2 z_n$.

So

$\limsup_N [-\frac{1}{N^2}z_N] \le -\frac{z_n}{n^2}$,

which implies

$\limsup_N [-\frac{1}{N^2}z_N] \le \inf_n [-\frac{z_n}{n^2}]$. □

As a consequence we can now define an entropy of an absolutely continuous process indexed by $\mathbb{Z}^2$, as
follows.

Definition 4.11:

The Shannon entropy $Se(F,T,S)$ of the absolutely continuous process $(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2}$ is given
by the equality

(31) $Se(F,T,S) := \lim_n [-\frac{1}{n^2}\int \log f_{n^2} \circ F_{n^2}\,d\mu] = \lim_n \frac{H_{n^2}}{n^2}$. (*)

Note that here too it follows from Lemma 4.10 that $Se(F,T,S) = -\int f(t)\log f(t)\,dt$ if and only if
the process $(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2}$ is independent.



In order to identify the limit in Definition 4.11, for $F \in AC(\Omega, T, S, \mu)$, we need some further
notation. Let

(32) $g^n_{i,j} := \frac{d\mu^n_{i,j}}{d\nu^n_{i,j}} = \frac{f(X^n_{i,j})}{f(x_{0,0}) \times f(Y^n_{i,j})}$,

where $\mu$, $\nu$ are as in (21) and (20), respectively, and $\mu^n_{i,j}$, $\nu^n_{i,j}$ are as in $(\tau_0)$.
Remark 4.12:

With $\mu$, $\nu$, $g^n_{i,j}$ as in (21), (20) and (32), respectively, and $L$ as in $(\sigma_2)$, the following properties
hold:
The family $(g^n_{i,j})_{(i,j,n) \in L}$ is a $\nu$ martingale.

(33) $\int g^n_{i,j}\log g^n_{i,j}\,d\nu \le \int g^n_{n-1,n-1}\log g^n_{n-1,n-1}\,d\nu$.

(34) $\sup_{(i,j,n) \in L} \int g^n_{i,j}\log g^n_{i,j}\,d\nu = \sup_n \int g^n_{n-1,n-1}\log g^n_{n-1,n-1}\,d\nu$.

$\frac{1}{n^2}I_{n^2} = -\frac{1}{n^2}\sum_{i,j=0}^{n-1} \log g^n_{i,j} - \log f(x_{0,0})$. (E)

Remark 4.13 

Notations are as in Remark 4.12. Suppose $\sup_n \int g^n_{n-1,n-1}\log g^n_{n-1,n-1}\,d\nu < \infty$. Then if $l_k =
(i_k, j_k, n_k)$ is an increasing sequence in $L$, the martingale $M_k := g^{n_k}_{i_k,j_k}$ is uniformly integrable and
converges $\nu$ almost everywhere and in $L^1(\nu)$ to the density $g_s$ of $\mu$ restricted to $\sigma(\{M_k : k \ge 1\})$
with respect to $\nu$.

In particular, if $U_n := g^n_{n-1,n-1}$, then $U_n$ converges $\nu$ almost everywhere to the density $g$ of $\mu$ with
respect to $\nu$ on the sigma-algebra $\sigma(\{x_{s,t} : s, t \ge 0\})$ and therefore $g^n_{i,j} = E_\nu(g \mid \sigma(X^n_{i,j}))$.

In fact, by (34), the hypothesis implies $\sup_{(i,j,n) \in L} \int g^n_{i,j}\,|\log g^n_{i,j}|\,d\nu < \infty$, so that the family
$(g^n_{i,j})_{(i,j,n) \in L}$ is uniformly integrable with respect to $\nu$. Here we used the following
Remark 4.14:

Let $m$ be a finite measure and $\Phi = \{f_i : i \in I\}$ be a family of positive elements in $L^1(m)$ with the
property

$\sup_i \int f_i\log f_i\,dm < \infty$.

Then (1) $\sup_i \int f_i\,|\log f_i|\,dm < \infty$.

(2) The family $\{f_i : i \in I\}$ is uniformly integrable.

(3) The family $\{(\log f_i)^+ : i \in I\}$ is uniformly integrable.
The following lemma identifies the Shannon entropy of $(F \circ S^m \circ T^n)_{(m,n) \in \mathbb{Z}^2}$, for $\mathbb{Z}^2$ action (the
limit in Lemma 4.10).
Lemma 4.15

Let $F \in AC(\Omega, T, S)$, $\mu$ and $\nu$ be as in (21) and (20), respectively. Then

(35) $Se(F,T,S) = \lim_n \frac{H_{n^2}}{n^2} = -H_\nu(\mu) - \int f(t)\log f(t)\,dt$,

and in particular $Se(F,T,S)$ is finite if and only if $H_\nu(\mu) < \infty$.
Proof: Equation (E) implies

$\frac{1}{n^2}H_{n^2} = -\frac{1}{n^2}\sum_{i,j=0}^{n-1} \int g^n_{i,j}\log g^n_{i,j}\,d\nu - \int f(t)\log f(t)\,dt$,

so

$\frac{1}{n^2}H_{n^2} \ge -H_{\nu^n_{n-1,n-1}}(\mu^n_{n-1,n-1}) - \int f(t)\log f(t)\,dt \ge -H_\nu(\mu) - \int f(t)\log f(t)\,dt$,

proving one direction. On the other hand, for every $n$, let $t_n = \int g^n_{n-1,n-1}\log g^n_{n-1,n-1}\,d\nu =
H_{\nu^n_{n-1,n-1}}(\mu^n_{n-1,n-1})$, and $u_n = \frac{1}{n^2}\sum_{i,j=0}^{n-1} H_{\nu^n_{i,j}}(\mu^n_{i,j})$. If $k > p$, then

$u_k \ge \frac{1}{k^2}\sum_{i=0}^{k-1}\sum_{j=p}^{k-1} H_{\nu^k_{i,j}}(\mu^k_{i,j}) \ge \frac{1}{k^2}\sum_{i=0}^{k-1}\sum_{j=p}^{k-1} t_p = (1 - \frac{p}{k})\,t_p$.

Hence $\liminf_k u_k \ge t_p$, which implies $\liminf_k u_k \ge \sup_p t_p$, and proves the other direction and
also the equality $H_\nu(\mu) = \lim_n u_n$, or that $\frac{1}{n^2}H_{n^2}$ converges to $-H_\nu(\mu) - \int f(t)\log f(t)\,dt$. □

We give examples of non Gaussian and non independent processes with finite Shannon entropy.
Lemma 4.16:

Let $(\Omega, \mu, T)$ be a dynamical system. Let $F, G \in L^1(\mu)$. Let $\xi_n := (F, F \circ T, ..., F \circ T^{n-1})$, and
$\eta_n := (G, G \circ T, ..., G \circ T^{n-1})$. Suppose that $\xi_n$ is absolutely continuous and that $\xi_n$ and $\eta_n$ are
independent. Then $\xi_n + \eta_n$ is absolutely continuous and

(36) $H_n(F + G) \ge H_n(F)$.

Proof: If $H_n(F) = -\infty$, there is nothing to prove. Suppose then that $H_n(F) > -\infty$. Denote by $f$
the density, with respect to $l^n$, of the law $\alpha$ of $\xi_n$, by $\mu$ the law of $\eta_n$, and by $\nu$ the law of $\xi_n + \eta_n$.
Then $\nu = \alpha \star \mu = g\,dl^n$, and $g(t) = \int_{\mathbb{R}^n} f(t - y)\,d\mu(y)$, for $l^n$-almost all $t \in \mathbb{R}^n$. Thus, $\psi$ being
concave, Jensen's inequality gives

$H_n(F + G) = \int_{\mathbb{R}^n} \psi(g)\,dl^n = \int_{\mathbb{R}^n} \psi(\int_{\mathbb{R}^n} f(t - y)\,d\mu(y))\,dl^n(t)$

$\ge \int_{\mathbb{R}^n} [\int_{\mathbb{R}^n} \psi(f(t - y))\,d\mu(y)]\,dl^n(t) = \int_{\mathbb{R}^n} d\mu(y) \int_{\mathbb{R}^n} \psi(f(u))\,dl^n(u) = H_n(F)$. □
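
A one-dimensional instance of (36) can be checked numerically: if the laws of $F$ and $G$ are independent and uniform on $[0,1]$, the law of $F + G$ has the triangular density on $[0,2]$. The Python sketch below approximates $\int \psi(\cdot)\,dl$ by a midpoint rule; the choice of the two uniform laws, the quadrature and the function names are assumptions made for the illustration.

```python
import math

def psi(x):
    """psi(x) = -x log(x), extended by psi(0) = 0."""
    return -x * math.log(x) if x > 0.0 else 0.0

def entropy(density, lo, hi, cells=200000):
    """Midpoint-rule approximation of the integral of psi(density) over [lo, hi]."""
    h = (hi - lo) / cells
    return h * sum(psi(density(lo + (i + 0.5) * h)) for i in range(cells))

def uniform_density(t):
    """Density of the uniform law on [0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def triangular_density(t):
    """Density of the sum of two independent uniform variables on [0, 1]."""
    if 0.0 <= t <= 1.0:
        return t
    if 1.0 < t <= 2.0:
        return 2.0 - t
    return 0.0

h_f = entropy(uniform_density, 0.0, 1.0)       # H_1(F)
h_sum = entropy(triangular_density, 0.0, 2.0)  # H_1(F + G)
print(h_f, h_sum)
```

The entropy of the sum is close to $1/2$ while that of each summand is $0$, in agreement with (36).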

Corollary 4.17: 

Let $F \in AC(\Omega, T, \mu)$. Assume that $G \in L^1(\mu)$ is such that the two processes $(F \circ T^n)_{n \ge 0}$ and
$(G \circ T^n)_{n \ge 0}$ are independent. Then

(37) $Se(F + G, T) \ge Se(F,T)$.

If, in addition, $G \in AC(\Omega, T, \mu)$, then

$Se(F + G, T) \ge \max\{Se(F,T), Se(G,T)\} \ge \frac{1}{2}(Se(F,T) + Se(G,T))$.

4.3 Connections between Shannon entropy and Kolmogorov-Sinai entropy 

We show that if for some $F \in AC(\Omega, T, \mu)$, the Shannon entropy $Se(F,T)$ is finite then the Kolmogorov
entropy of $(\Omega, T, \mu)$ is infinite (Corollary 4.21). We also give a new way to describe the Shannon
entropy for $\mathbb{Z}$ action (Theorem 4.20 (b)), and for $\mathbb{Z}^2$ action (Remark 4.22). Finally, we obtain a
criterion of Markovianness (Proposition 4.23).

The following lemma is important for comparing the Kolmogorov-Sinai entropy to the Shannon
entropy.

If $\mathcal{P}$ is a finite measurable partition of a probability space $(\Omega, \mathcal{F}, P)$ then the entropy of $\mathcal{P}$ will be
denoted $H(\mathcal{P})$, or $H(\mathcal{P}, P)$ or $H^P(\mathcal{P})$.
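Lemma 4.18:

Let $\mathcal{E}$ be the set of finite measurable partitions of $\mathbb{R}$, and $F^{-1}\mathcal{E} = \{F^{-1}\mathcal{P} : \mathcal{P} \in \mathcal{E}\}$. Then

$H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1}) = \sup_{B \in F^{-1}\mathcal{E}} [H^\mu(\bigvee_{j=0}^{n-1} T^j B) - H^\mu(\bigvee_{j=0}^{n-1} T^j B \mid \bigvee_{j=1}^{p} T^{-j}B)]$.

Proof: By Theorem B of Dobrushin,

$H_{\mu F_n^{-1} \times \mu F_p^{-1}}(\mu F_{n+p}^{-1}) = \sup S_{\mathcal{P}_1 \times \mathcal{P}_2}(\mu F_{n+p}^{-1} \mid \mu F_n^{-1} \times \mu F_p^{-1})$,

the supremum being taken over all finite partitions $\mathcal{P}_1$ of $\mathbb{R}^n$ and $\mathcal{P}_2$ of $\mathbb{R}^p$, of the following particular
forms: $\mathcal{P}_1 = \mathcal{P} \times ... \times \mathcal{P}$, $n$ times, and $\mathcal{P}_2 = \mathcal{P} \times ... \times \mathcal{P}$, $p$ times, where $\mathcal{P}$ runs in $\mathcal{E}$. But it holds

$S_{\mathcal{P}_1 \times \mathcal{P}_2}(\mu F_{n+p}^{-1} \mid \mu F_n^{-1} \times \mu F_p^{-1}) = -H(\mathcal{P}_1 \times \mathcal{P}_2, \mu F_{n+p}^{-1}) + H(\mathcal{P}_1, \mu F_n^{-1}) + H(\mathcal{P}_2, \mu F_p^{-1})$,

and, for these particular forms of $\mathcal{P}_i$, $i = 1, 2$, setting $B := F^{-1}\mathcal{P}$, we easily have

$H(\mathcal{P}_1, \mu F_n^{-1}) = H^\mu(\bigvee_{j=0}^{n-1} T^{-j}B)$,

$H(\mathcal{P}_2, \mu F_p^{-1}) = H^\mu(\bigvee_{j=0}^{p-1} T^{-j}B)$, and $H(\mathcal{P}_1 \times \mathcal{P}_2, \mu F_{n+p}^{-1}) = H^\mu(\bigvee_{j=0}^{n+p-1} T^{-j}B)$.

It follows then that

$S_{\mathcal{P}_1 \times \mathcal{P}_2}(\mu F_{n+p}^{-1} \mid \mu F_n^{-1} \times \mu F_p^{-1}) = H^\mu(\bigvee_{j=0}^{n-1} T^j B) - H^\mu(\bigvee_{j=0}^{n-1} T^j B \mid \bigvee_{j=1}^{p} T^{-j}B)$. □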

3=0 j=o j=l 



In the light of Lemma 4.18, formula (27) in Lemma 4.8 can be written as

(38) $\sup_{B \in F^{-1}\mathcal{E}} [H^\mu(\bigvee_{j=0}^{n-1} T^j B) - H^\mu(\bigvee_{j=0}^{n-1} T^j B \mid \bigvee_{j=1}^{p} T^{-j}B)] = H_n + H_p - H_{n+p}$.

Remark 4.19:

Formula (38) allows one to obtain conditions which ensure that $H_n$ will be finite.
One can prove for instance that, for all $N$, the following are equivalent:
(i) $H_{N+1}$ is finite.

(ii) $\sup_{B \in F^{-1}\mathcal{E}} [H^\mu(B) - H^\mu(B \mid \bigvee_{j=1}^{N} T^{-j}B)] < +\infty$.

(ii') $H_{\mu F_1^{-1} \times \mu F_N^{-1}}(\mu F_{N+1}^{-1}) < +\infty$.

The following theorem gives, in particular, the announced new description of the Shannon en- 
tropy for Z action. 
Theorem 4.20: 

Let $\mathcal{E}$ be as in Lemma 4.18. Suppose $H_n$ finite for all $n$. Then

(a) The following are equivalent:

(i) $Se(F,T)$ is finite.

(ii) For every $n$ (or for some $n$), $\sup_{B \in F^{-1}\mathcal{E}} [H^\mu(\bigvee_{j=0}^{n-1} T^j B) - nH^\mu(B \mid B^-)] < +\infty$.

(iii) $\lim_n \frac{1}{n}\sup_{B \in F^{-1}\mathcal{E}} [H^\mu(\bigvee_{j=0}^{n-1} T^j B) - nH^\mu(B \mid B^-)] = 0$.

(b) The following equality holds

(39) $Se(F,T) = H_1 + \inf_{B \in F^{-1}\mathcal{E}} [H^\mu(B \mid B^-) - H^\mu(B)]$.

Proof: Formulas (38), (27) and (28) imply the following

(40) $z_n := \sup_{B \in F^{-1}\mathcal{E}} [H^\mu(\bigvee_{j=0}^{n-1} T^j B) - H^\mu(\bigvee_{j=0}^{n-1} T^j B \mid B^-)] = H_n - n \times Se(F,T)$,

in which as usual $B^- := T^{-1}B \vee T^{-2}B \vee ...$ Now

$H^\mu(\bigvee_{j=0}^{n-1} T^j B \mid B^-) = nH^\mu(B \mid B^-)$,

so that formula (40) establishes the equivalence between (i) and (ii). As (i) implies (iii), by
Remark 4.9, and trivially (iii) implies (ii), the proof is finished, because taking $n = 1$ in formula (40)
yields equality (39).
Corollary 4.21:

Let $(\Omega, T, \mu)$ be an invertible dynamical system. If there exists $F \in AC(\Omega, T, \mu)$ such that the Shannon entropy $Se(F,T)$ of the process $(F \circ T^n)$ is finite then $(\Omega, T, \mu)$ has infinite entropy.

Proof: The corollary follows directly from the equivalence between (i) and (ii) in Theorem 4.20.

We now establish, for $\mathbb{Z}^2$ action, a formula analogous to formula (39) for $\mathbb{Z}$ action.
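Remark 4.22:

Let $\mathcal{E}$ be as in Lemma 4.18. Then, for the stationary absolutely continuous process $(F \circ S^m \circ T^n)$, indexed by $\mathbb{Z}^2$, the following holds

(41) $$Se(F,T,S) = -\int f(t) \log f(t)\,dt + \inf_{\mathcal{P} \in \mathcal{E}} \big[ H^{\mu}(F^{-1}\mathcal{P} \mid (F^{-1}\mathcal{P})^-) - H^{\mu}(F^{-1}\mathcal{P}) \big].$$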

In fact, we have H v {(j) = sup n H Vn ^ n ). But 

H Vn (nn) = sup fin I v n ), 

the supremum being taken over all partitions IT of the form IT = {^(ij)^!" 1 i S~ t T~ J ' F~ 1 E : E £ 
V}, where V is a finite partition of R. But then, with Eij = S~ l T~ 3 F~ 1 E, from the definitions of 
fi n and u n , (see (21) and (20) ), it follows 

S n (Mn | i/ B ) = ^ l^(n(iJ)^. lin . 1 Eij)log (n ^-y = 

H' 1 (F- 1 V) - \J S- i T~ : >F- 1 V)+H' x { \J S^T^F^V) = 

(M")e^_i,„_i (<J')e^_i,„_i 

H»{F- l V) - H^(F^ 1 V | y S^T^F^V). 

(M)G^-l.n-l 
Hence



supH^M = suplH^F- 1 ?) - H^F^V \ (F^P)')}, 



n V 



where (F^P)~ = V n >! V(y) 6j ; 



n— l,n— 1 



T-iS-iF- 1 ?). 



By Lemma 4.15, we thus obtain (41). 

The following proposition gives a criterion for the Markovianness of the process $(F \circ T^n)$.

Proposition 4.23:

Let $F \in AC(\Omega, T, m)$ such that $Se(F,T)$ is finite. Then

(i) The process $(F \circ T^n)$ is Markovian if and only if $Se(F,T) = H_2 - H_1$. More generally

(ii) The process $(F \circ T^n)$ has memory $p$ if and only if $Se(F,T) = H_{p+1} - H_p$.

Proof: We prove (i). The proof of (ii) is similar. If $(F \circ T^n)$ is Markovian, we see by formulas (13) and (15) that $Se(F,T) = H_2 - H_1$.
For the other direction, put, for $n > 1$,



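$\mathcal{R}_n = \mathcal{R}_n(\mathcal{A}) := T^{-2}\mathcal{A} \vee \dots \vee T^{-n}\mathcal{A}$, and $L = L_n = E[\,\cdot \mid T^{-1}\mathcal{A}] - E[\,\cdot \mid T^{-1}\mathcal{A} \vee \mathcal{R}_n]$. Let $f$ be a bounded $F^{-1}\mathcal{E}$-measurable function, where, as in Lemma 7, $\mathcal{E}$ denotes the set of finite measurable partitions of $\mathbb{R}$. We shall show that $L(f) = 0$ and the proof will be finished. To do this, we use a Lemma in [19], according to which for any $\epsilon > 0$ there exists $\delta(\epsilon)$, $0 < \delta(\epsilon) < \epsilon$, such that for any probability space $(\Omega, \mathcal{F}, \mu)$ and any finite partitions $\mathcal{P}$ and $\mathcal{Q}$ of $\Omega$, the inequality $H^{\mu}(\mathcal{P}) - H^{\mu}(\mathcal{P} \mid \mathcal{Q}) < \delta(\epsilon)$ implies that $\mathcal{P}$ and $\mathcal{Q}$ are $\epsilon$-independent.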

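Recall that, following Ornstein, if $\mathcal{P}$ and $\mathcal{Q}$ are finite measurable partitions of a probability space $(\Omega, \mathcal{F}, \mu)$, then $\mathcal{P}$ is said to be $\epsilon$-independent of $\mathcal{Q}$ if

$$\sum_{p \in \mathcal{P}} |\mu(p \mid q) - \mu(p)| < \epsilon$$

for all atoms $q$ of $\mathcal{Q}$ except a set of atoms whose union has measure less than $\epsilon$. Recall also that, in formula (15), $a_n$ denotes

$$\sup_{\mathcal{A}} \big[ H(\mathcal{A}) - H(\mathcal{A} \mid T^{-1}\mathcal{A} \vee \dots \vee T^{-n}\mathcal{A}) \big].$$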
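Fix $\epsilon > 0$. Then there exists $\mathcal{A}_0$ such that

$$a_1 - \delta(\epsilon)^2 < H(\mathcal{A}) - H(\mathcal{A} \mid T^{-1}\mathcal{A}),$$

for any $\mathcal{A}$ which is finer than $\mathcal{A}_0$. But, by formula (15), the hypothesis $Se(F,T) = H_2 - H_1$ is equivalent to $a_1 = a_n$ for any $n$. Thus

$$H(\mathcal{A} \mid T^{-1}\mathcal{A}) - H(\mathcal{A} \mid T^{-1}\mathcal{A} \vee \mathcal{R}_n(\mathcal{A})) < \delta(\epsilon)^2.$$

So if we denote, respectively, by $p$, $q$ and $r$ the generic element of $\mathcal{A}$, $T^{-1}\mathcal{A}$ and $\mathcal{R}_n$, we get

$$\sum_{q} m(q) \big[ H^{m_q}(\mathcal{A}_q) - H^{m_q}(\mathcal{A}_q \mid \mathcal{R}_n^q) \big] < \delta(\epsilon)^2,$$

where $m_q(A) = m(A \cap q)/m(q)$, $\mathcal{A}_q = \{p \cap q : p \in \mathcal{A}\}$ and similarly for $\mathcal{R}_n^q$.

Let $Q_\epsilon := \{q : H^{m_q}(\mathcal{A}_q) - H^{m_q}(\mathcal{A}_q \mid \mathcal{R}_n^q) > \delta(\epsilon)\}$. It follows that

$$\sum_{q \in Q_\epsilon} m(q) < \delta(\epsilon), \quad (e_1)$$

and that, for $q \notin Q_\epsilon$, the partitions $\mathcal{A}_q$ and $\mathcal{R}_n^q$ are $\epsilon$-independent under the measure $m_q$, that is, there is $J_q$, a subfamily of $\mathcal{R}_n^q$, such that

$$\sum_{r \in J_q} m_q(r) > 1 - \epsilon, \quad (e_2)$$

and

$$\sum_{p} |m(p \mid q \cap r) - m(p \mid q)| < \epsilon, \quad \forall r \in J_q. \quad (e_3)$$

Now we can find $\mathcal{A}$ finer than $\mathcal{A}_0$, and $g = g_\epsilon = \sum_{p \in \mathcal{A}} y_p 1_p$ such that $\|f - g\|_1 < \epsilon$ and $\|g\|_\infty < 2 \times \|f\|_\infty$. Then

$$\|L(g)\|_1 = \sum_{q,r} \Big| \sum_{p} y_p [m(p \mid q) - m(p \mid q \cap r)] \Big| \, m(q \cap r) = \sum_{q \in Q_\epsilon} + \sum_{q \notin Q_\epsilon}.$$

In view of $(e_1)$, the first sum in the above equality is bounded by $2 \|g\|_\infty \delta(\epsilon)$. By $(e_3)$ and $(e_2)$, the second one is bounded by $3 \|g\|_\infty \times \epsilon$.

Therefore

$$\|L(g)\|_1 < 2 \|f\|_\infty [2\delta(\epsilon) + 3\epsilon].$$

It follows

$$\|L(f)\|_1 < 2 \|f\|_\infty [2\delta(\epsilon) + 3\epsilon] + 2\epsilon.$$

This implies $L(f) = 0$. $\Box$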

Note that if $(F \circ T^n)$ is Markovian then for any $p$, $H_{2p+1} = H_1 + 2p(H_2 - H_1)$ and $H_{2p+2} = H_2 + 2p(H_2 - H_1)$.
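The linear growth of the block entropies in the Markovian case can be checked numerically. The following is only a sketch under stated assumptions: it takes a stationary Gaussian process with covariance $r(k) = \rho^{|k|}$ (a Gaussian Markov process) and uses the standard closed form $H_n = \frac{n}{2}\log(2\pi e) + \frac{1}{2}\log\det R_n$ for the differential entropy of a Gaussian vector. The names `toeplitz`, `log_det` and `block_entropy` are illustrative helpers, not notation from the text.

```python
import math

def toeplitz(rho, n):
    """Covariance matrix R_n with entries rho**|i-j| (assumed Gaussian Markov example)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def log_det(matrix):
    """Log-determinant of a symmetric positive definite matrix by Gaussian elimination."""
    a = [row[:] for row in matrix]
    n = len(a)
    total = 0.0
    for k in range(n):
        pivot = a[k][k]  # stays positive for a positive definite matrix
        total += math.log(pivot)
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return total

def block_entropy(rho, n):
    """Differential entropy H_n of n consecutive coordinates of the Gaussian process."""
    return 0.5 * n * math.log(2 * math.pi * math.e) + 0.5 * log_det(toeplitz(rho, n))

rho = 0.6
H1, H2 = block_entropy(rho, 1), block_entropy(rho, 2)
for p in range(1, 5):
    # H_{2p+1} = H_1 + 2p (H_2 - H_1) and H_{2p+2} = H_2 + 2p (H_2 - H_1)
    assert abs(block_entropy(rho, 2 * p + 1) - (H1 + 2 * p * (H2 - H1))) < 1e-9
    assert abs(block_entropy(rho, 2 * p + 2) - (H2 + 2 * p * (H2 - H1))) < 1e-9
```

In this example $H_2 - H_1 = \frac{1}{2}\log(2\pi e(1 - \rho^2))$, which is the limit of $H_n/n$, in agreement with Proposition 4.23 (i).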

One might be tempted to introduce an isomorphism invariant $Se(T)$:

(42) $$Se(T) := \sup_{F \in AC(\Omega, T, \mu)} \{ Se(F,T) - H_1(F,T) \}.$$

It is indeed an invariant; however, it can only take on the two values $-\infty$ and $0$.



5 Entropy rate and Shannon entropy 

We shall establish some connections between Shannon entropy and some concepts developed by Pinsker such as information stability and entropy rate.

We recall some definitions from [14]. First recall that if $Z$ is a random variable then its law is denoted $P_Z$.



Definition 5.1:

Let $\xi = (\xi_n)_{n \ge 1}$ and $\eta = (\eta_n)_{n \ge 1}$ be discrete time stationary processes. The entropy rate of $\xi$ with respect to $\eta$ is

$$H_\eta(\xi) := \lim_n \frac{1}{n} H_{P_{(\eta_1, \dots, \eta_n)}}(P_{(\xi_1, \dots, \xi_n)}),$$

defined when, for all $j$, $\xi_j$ and $\eta_j$ take values in the same measurable space, and when the limit exists.

Remark 5.2:

We can prove, using Lemma 1, that if $\eta$ is the independent Gaussian process, then for any discrete time real state stationary process $\xi$, the entropy rate of $\xi$ with respect to $\eta$ is well defined.

Lemma 5.3:

Let $\eta = (X_n)$ be the independent Gaussian process, with law $\gamma$, and $\gamma_n$ its projection on the first $n$ coordinates. Let $(\Omega, T, \mu)$ be a dynamical system, $F \in AC(\Omega, T, \mu)$ and $\nu$ the law of the process $\xi := (F, F \circ T, \dots)$. Then

(i) $Se(F,T)$ is finite if and only if $\sup_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1}) < \infty$.

(ii) $\nu = \gamma$ if and only if $\lim_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1}) = 0$.

(iii) If $H_\gamma(\nu)$ is finite then $\nu = \gamma$.

Proof: Formula (23) implies

(43) $$Se(F,T) = \frac{1}{2}(\log(2\pi) + \|F\|_2^2) - \lim_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1}),$$

from which we see that $Se(F,T)$ is finite if and only if $\lim_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1})$ is finite. This proves (i) because, by super-additivity,

$$\lim_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1}) = \sup_n \frac{1}{n} H_{\gamma_n}(\mu F_n^{-1}).$$

Also this last equality together with Remark 3.2 (2) proves (ii).

To prove (iii) note that $H_{\gamma_n}(\mu F_n^{-1}) \le H_\gamma(\nu)$, and formula (23) thus implies the inequality

$$\frac{1}{n} H_n(F) \ge \frac{1}{2}(\log(2\pi) + \|F\|_2^2) - \frac{1}{n} H_\gamma(\nu),$$

which in turn implies the following one

$$Se(F,T) \ge \frac{1}{2}(\log(2\pi) + \|F\|_2^2).$$

Hence, by Corollary 4.5, we have $\nu = \gamma$. $\Box$

Corollary 5.4:

Let notations be exactly as in Lemma 5.3. Then

(i) $Se(F,T)$ is finite if and only if the entropy rate of $\xi$ with respect to $\eta$ is finite.

(ii) $\nu = \gamma$ if and only if the entropy rate of $\xi$ with respect to $\eta$ vanishes.

(iii)

(44) $$Se(F,T) = \frac{1}{2}(\log(2\pi) + \|F\|_2^2) - H_\eta(\xi).$$
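Formula (44) can be illustrated in the simplest situation. The sketch below assumes an independent process whose coordinates are centered Gaussian with variance $s^2$ (here $s^2 = 2.25$, an arbitrary choice). In that case the entropy rate with respect to the independent standard Gaussian process reduces to the one-dimensional relative entropy, and $Se(F,T)$ reduces to the differential entropy of one coordinate. The helper names `gaussian_density` and `integrate` are illustrative and not taken from the text.

```python
import math

def gaussian_density(x, variance):
    """Density of a centered Gaussian law with the given variance."""
    return math.exp(-x * x / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def integrate(func, lower, upper, steps=20000):
    """Midpoint rule on [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * width) for k in range(steps)) * width

variance = 2.25  # assumed value of ||F||_2^2 for this illustration

def f(x):
    return gaussian_density(x, variance)

def gamma(x):
    return gaussian_density(x, 1.0)

# Relative entropy of the law of F with respect to the standard Gaussian law
relative_entropy = integrate(lambda x: f(x) * math.log(f(x) / gamma(x)), -20.0, 20.0)
# Differential entropy of F, which equals Se(F, T) for an independent process
differential_entropy = integrate(lambda x: -f(x) * math.log(f(x)), -20.0, 20.0)
# Right-hand side of (44) in this independent case
right_hand_side = 0.5 * (math.log(2 * math.pi) + variance) - relative_entropy

assert abs(right_hand_side - differential_entropy) < 1e-8
```

The relative entropy here equals $\frac{1}{2}(s^2 - 1 - \log s^2)$, which vanishes only for $s^2 = 1$, consistent with item (ii).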



Lemma 5.5:

Let $(\Omega, T, \mu)$ be a dynamical system, $F \in AC(\Omega, T, \mu)$ and $\nu$ the law of the process $\xi := (F, F \circ T, \dots)$. Let $P$ be the product measure $P := \mu F^{-1} \otimes \mu F^{-1} \otimes \dots$ and $P(n) = (\mu F^{-1})^{\otimes n}$ its projection to the first $n$ coordinates. Then

(i) If for all $n \in \mathbb{N}$, $H_n$ is finite (in particular if $Se(F,T)$ is finite), $\nu$ is locally absolutely continuous with respect to $P$: $\forall n$, $\mu F_n^{-1} \ll (\mu F^{-1})^{\otimes n}$.

(ii) $Se(F,T)$ is finite if and only if $\sup_n \frac{1}{n} H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1})$ is finite.

(iii) $\nu = P \iff \lim_n \frac{1}{n} H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1}) = 0 \iff Se(F,T) = \int_{\mathbb{R}} \phi\big(\frac{d\mu F^{-1}}{dl}\big)\,dl$.

(iv) If $H_P(\nu)$ is finite then $\nu = P$.

(v)

(45) $$Se(F,T) = \int_{\mathbb{R}} \phi\Big(\frac{d\mu F^{-1}}{dl}\Big)\,dl - \sup_n \frac{1}{n} H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1}).$$



Proof: Recalling that $l_n(F)$ and $l_{n,PM}(F)$ are defined respectively by (10) and (15), we have the following formula

(46) $$l_n(F) = l_{n,PM}(F) - \sum_{j=0}^{n-1} \log \frac{d\mu F^{-1}}{dl} \circ F \circ T^j,$$

which can be written as 

a proof of which is as follows. 

First, by taking in formula (7) in Lemma 3.3, $P_1 = (\mu F^{-1})^{\otimes n}$ and $P_2 = \mu F^{-1}$, we find

$$H_{(\mu F^{-1})^{\otimes n} \times \mu F^{-1}}(\mu F_{n+1}^{-1}) = H_{\mu F_n^{-1} \times \mu F^{-1}}(\mu F_{n+1}^{-1}) + H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1}) + H_{\mu F^{-1}}(\mu F^{-1}) = H_{\mu F_n^{-1} \times \mu F^{-1}}(\mu F_{n+1}^{-1}) + H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1}).$$

And next, by taking $p = 1$ in formula (27), we find

$$H_{(\mu F^{-1})^{\otimes (n+1)}}(\mu F_{n+1}^{-1}) = H_n + H_1 - H_{n+1} + H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1}),$$

from which it follows, when $H_{n+1}$ is finite, that $H_{(\mu F^{-1})^{\otimes (n+1)}}(\mu F_{n+1}^{-1})$ is finite if and only if $H_{(\mu F^{-1})^{\otimes n}}(\mu F_n^{-1})$ is finite. This, using Theorem A, implies, by induction on $n$, that if $H_m$ is finite for all $m$, then for every $n$, $\mu F_n^{-1}$ is absolutely continuous with respect to $(\mu F^{-1})^{\otimes n}$. Thus we can write

$$\frac{d\mu F_n^{-1}}{dl_n} = \frac{d\mu F_n^{-1}}{d(\mu F^{-1})^{\otimes n}} \times \frac{d(\mu F^{-1})^{\otimes n}}{dl_n}.$$

So, by taking logarithms, we obtain formula (46), from which follow immediately the formula (45), (ii), (iii) and (iv).

Remark 5.6: 

Formula (46) implies 

H n = H ritPM + n x [hog(2ir) + l\\F\\l -H^F' 1 )]. 
In the same way we have

(47) $$l_n(F) = l_{n,G}(F) + \frac{n}{2}\log(2\pi) + \frac{1}{2} \sum_{j=0}^{n-1} F^2 \circ T^j$$

and therefore (cf. formulas (22) and (23))

$$H_n = H_{n,G} + n \times \Big[\frac{1}{2}\log(2\pi) + \frac{1}{2}\|F\|_2^2\Big].$$

As a corollary, we obtain the following criterion for independence: 
Corollary 5.7: 

$Se(F,T) = H_1(F)$ if and only if the process $(F \circ T^n)$ is independent.

Note that Corollary 5.7 can also be proved by using formula (7) and (23). 

Corollary 5.7 together with formula (23) gives the following improvement of Corollary 4.5.

Corollary 5.8:

Let $\gamma_0$ be the probability measure with density $\frac{1}{(2\pi)^{1/2}} \exp(-\frac{1}{2}x^2)$ with respect to Lebesgue measure $l$ on $\mathbb{R}$. Let $F \in AC(\Omega, T, \mu)$. Then

(48) $$Se(F,T) \le -H_{\gamma_0}(\mu F^{-1}) + \frac{1}{2}(\log(2\pi) + \|F\|_2^2),$$

and the equality

$$Se(F,T) = -H_{\gamma_0}(\mu F^{-1}) + \frac{1}{2}(\log(2\pi) + \|F\|_2^2)$$

holds if and only if the process $(F \circ T^n)$ is independent.

Note, once more, that we see from this corollary that $Se(F,T) = \frac{1}{2}(\log(2\pi) + \|F\|_2^2)$ if and only if the process $(F \circ T^n)$ is Gaussian independent [cf. Corollary 4.5].

We can also prove the following, which, in particular, improves the inequality (48) in the preceding corollary, and gives a link between the Shannon entropy and information stability.

Lemma 5.9:

If $\xi = (\xi_n)_{n \ge 1}$ and $\eta = (\eta_n)_{n \ge 1}$ are discrete time stationary processes, the rate of generation of information about $\eta$ by $\xi$, or about $\xi$ by $\eta$ (following Pinsker), is

$$I(\xi, \eta) := \lim_n \frac{1}{n} H_{P_{(\xi_1, \dots, \xi_n)} \times P_{(\eta_1, \dots, \eta_n)}}(P_{((\xi_1, \dots, \xi_n), (\eta_1, \dots, \eta_n))}).$$

The pair $(\xi, \eta)$ is called information stable if $I(\xi, \eta) = 0$.

Let $(\Omega, T, \mu)$ be an invertible dynamical system and $F \in AC(\Omega, T, \mu)$. Let $\phi$ and $\pi$ be the processes defined by $\phi_n = F \circ T^{-n+1}$ and $\pi_n = F \circ T^n$, for $n = 1, 2, \dots$ Then

(i) The pair $(\phi, \pi)$ is information stable if and only if $\lim_n \big(\frac{H_n}{n} - \frac{H_{2n}}{2n}\big) = 0$. In particular, if $Se(F,T)$ is finite the pair $(\phi, \pi)$ is information stable.

(ii) $Se(F,T)$ is finite if and only if $\sum_{p=0}^{\infty} \frac{1}{2^{p+1}} H_{\mu F_{2^p}^{-1} \times \mu F_{2^p}^{-1}}(\mu F_{2^{p+1}}^{-1}) < \infty$. Moreover

(iii) $Se(F,T) = \frac{1}{2}(\log(2\pi) + \|F\|_2^2) - H_{\gamma_0}(\mu F^{-1}) - \sum_{p=0}^{\infty} \frac{1}{2^{p+1}} H_{\mu F_{2^p}^{-1} \times \mu F_{2^p}^{-1}}(\mu F_{2^{p+1}}^{-1})$.

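We also have

Remark 5.10:

Let $(\Omega, T, \mu)$ be a dynamical system, $F \in AC(\Omega, T, \mu)$ and $\xi$ the process $\xi := (F, F \circ T, \dots)$. Then the entropy rate of $\xi$ with respect to the independent Gaussian stationary process $\eta$ is given by

$$H_\eta(\xi) = H_{\gamma_0}(\mu F^{-1}) + \sum_{p=0}^{\infty} \frac{1}{2^{p+1}} H_{\mu F_{2^p}^{-1} \times \mu F_{2^p}^{-1}}(\mu F_{2^{p+1}}^{-1}).$$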



6 Application to Gaussian processes 

In this section we express the Shannon entropy $Se(F,T)$ in terms of the spectral measure of the Gaussian process $X = (X_n)_{n \in \mathbb{Z}}$, when $F = X_0$ is the zero coordinate function (Lemma 6.3). This enables us (1) to prove that in the class of Gaussian Markovian processes, the Shannon entropy almost determines the process (Remark 6.4), (2) to show how this entropy changes under a linear change of variable (Corollary 6.5), and (3) to prove that all unilateral Gaussian processes with finite Shannon entropy are isomorphic (Theorem 6.6).
We first need some preliminaries.

Let $(v_n)_{n \ge 0}$ be a stationary sequence of unit vectors in the real Hilbert space $H$, with $(r(n))_{n \in \mathbb{Z}}$ a strictly positive definite sequence (defining $r(-n) = r(n)$), where $r(n) = \langle v_n, v_0 \rangle = \langle v_{n+k}, v_k \rangle$. Let $R_n$ be the $n \times n$ matrix $(R_n)_{ij} = r(i - j)$, for $i, j = 0, \dots, n-1$, and $r$ be the vector $r = [r(1), \dots, r(n-1)]^t$. Then

$$R_n = \begin{pmatrix} 1 & r^t \\ r & R_{n-1} \end{pmatrix}.$$

Evidently for each $n$ there exists a unique vector $a = [a_1, \dots, a_{n-1}]^t$ such that the vector

$$w_n := v_0 - \sum_{i=1}^{n-1} a_i v_i$$

is orthogonal to $v_i$ for $i = 1, \dots, n-1$. Using the orthogonal decomposition

$$v_0 = w_n + (a_1 v_1 + \dots + a_{n-1} v_{n-1}),$$

we can prove, by taking scalar products $\langle v_1, v_0 \rangle, \dots, \langle v_{n-1}, v_0 \rangle$, that $a$ is given by the equation $r = R_{n-1} a$, or

$$a = R_{n-1}^{-1} r.$$



Lemma 6.1:

For any $X^t = (x_0, \dots, x_{n-1})$ set $Y^t = (x_1, \dots, x_{n-1})$. Then we have

(49) $$X^t R_n^{-1} X - Y^t R_{n-1}^{-1} Y = \frac{\big(x_0 - \sum_{j=1}^{n-1} a_j x_j\big)^2}{\|w_n\|^2}$$

and

(50) $$\det(R_n) = \|w_n\|^2 \det(R_{n-1}).$$
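Identities (49) and (50) are the usual Schur complement relations for the block decomposition of $R_n$, and they can be checked numerically. The sketch below is only an illustration under stated assumptions: the covariance sequence `autocovariance` is a hypothetical positive definite example with $r(0) = 1$ (a mixture of two geometric sequences), and `solve`, `determinant` and `quadratic_form` are small illustrative helpers, not part of the text. It uses $\|w_n\|^2 = \langle w_n, v_0 \rangle = 1 - \sum_i a_i r(i)$, which follows from the orthogonality of $w_n$ to $v_1, \dots, v_{n-1}$.

```python
def autocovariance(k):
    """Hypothetical positive definite sequence with r(0) = 1."""
    return 0.7 * 0.5 ** abs(k) + 0.3 * (-0.4) ** abs(k)

def toeplitz(n):
    """Matrix R_n with entries r(i - j)."""
    return [[autocovariance(i - j) for j in range(n)] for i in range(n)]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination (positive definite input)."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for k in range(n):
        pivot = a[k][k]
        a[k] = [value / pivot for value in a[k]]
        for i in range(n):
            if i != k:
                factor = a[i][k]
                a[i] = [vi - factor * vk for vi, vk in zip(a[i], a[k])]
    return [a[i][n] for i in range(n)]

def determinant(matrix):
    """Determinant by Gaussian elimination without pivoting (positive definite input)."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for k in range(n):
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return result

def quadratic_form(matrix, x):
    """Value of x^t matrix^{-1} x."""
    return sum(xi * yi for xi, yi in zip(x, solve(matrix, x)))

n = 5
R_n, R_prev = toeplitz(n), toeplitz(n - 1)
r = [autocovariance(k) for k in range(1, n)]
a = solve(R_prev, r)                                    # a = R_{n-1}^{-1} r
w_norm_sq = 1.0 - sum(ai * ri for ai, ri in zip(a, r))  # ||w_n||^2

X = [0.3, -1.2, 0.8, 2.0, -0.5]  # arbitrary test vector
Y = X[1:]
lhs_49 = quadratic_form(R_n, X) - quadratic_form(R_prev, Y)
rhs_49 = (X[0] - sum(aj * xj for aj, xj in zip(a, Y))) ** 2 / w_norm_sq
lhs_50 = determinant(R_n)
rhs_50 = w_norm_sq * determinant(R_prev)

assert abs(lhs_49 - rhs_49) < 1e-9
assert abs(lhs_50 - rhs_50) < 1e-12
```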



In the same way, note first that we have the equality

$$R_n = \begin{pmatrix} R_{n-1} & r' \\ (r')^t & 1 \end{pmatrix},$$

where $r'$ is the vector whose transpose is $(r')^t = (r(n-1), \dots, r(1))$. And for any $n$ there exists a unique vector $b = [b_0, \dots, b_{n-2}]^t$ such that the vector $u_n$ defined by

$$u_n = v_{n-1} - b_0 v_0 - \dots - b_{n-2} v_{n-2}$$

is orthogonal to $v_j$ for $j = 0, \dots, n-2$. So $v_{n-1} - u_n$ is the projection of $v_{n-1}$ onto the subspace spanned by $v_0, \dots, v_{n-2}$, and $u_n$ is the projection of $v_{n-1}$ onto the orthogonal complement of the linear span of $\{v_0, \dots, v_{n-2}\}$. Since

(51) $$v_{n-1} = u_n + (b_0 v_0 + \dots + b_{n-2} v_{n-2})$$

is an "orthogonal decomposition", we obtain, by taking scalar products $\langle v_0, v_{n-1} \rangle, \dots, \langle v_{n-2}, v_{n-1} \rangle$, the equality $r' = R_{n-1} b$, or

$$b = R_{n-1}^{-1} r'.$$

We have also the following 



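Lemma 6.2:

If $X = [x_0, \dots, x_{n-1}]^t$ and $Y = [x_0, \dots, x_{n-2}]^t$, then

(52) $$X^t R_n^{-1} X - Y^t R_{n-1}^{-1} Y = \frac{\big(x_{n-1} - \sum_{j=0}^{n-2} b_j x_j\big)^2}{\|u_n\|^2}$$

and

(53) $$\det(R_n) = \|u_n\|^2 \det(R_{n-1}).$$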
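Proof of Lemma 6.2: Let

$$Q = \begin{pmatrix} I_{n-1} & 0 \\ -b^t & 1 \end{pmatrix},$$

where $I_{n-1}$ is the identity matrix. Then

$$Q^{-1} = \begin{pmatrix} I_{n-1} & 0 \\ b^t & 1 \end{pmatrix}$$

and we have

$$R_n^{-1} = Q^t \begin{pmatrix} R_{n-1}^{-1} & 0 \\ 0 \dots 0 & \alpha \end{pmatrix} Q,$$

where $\alpha = \frac{1}{\|u_n\|^2}$. In fact this equality is equivalent to

$$Q R_n Q^t = \begin{pmatrix} R_{n-1} & 0 \\ 0 \dots 0 & \alpha^{-1} \end{pmatrix},$$

which can be easily verified. $\Box$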



Now let $\Omega = \mathbb{R}^{\mathbb{Z}}$, $\sigma$ the shift transformation, and $\mu$ a Gaussian $\sigma$-invariant probability measure determined by a (strictly) positive definite sequence $(r(n))_{n \in \mathbb{Z}}$, with $r(-n) = r(n)$ for any $n$, so that when $r(0) = 1$ there exists a probability measure $\nu$ on the unit circle $\mathbb{T}$ such that $\hat{\nu}(n) = r(n)$. In this case we shall call $\nu$ the spectral measure. In other words, each $n$ dimensional distribution has a density $p_n$ given by

(54) $$p_n(x_0, \dots, x_{n-1}) = \frac{1}{(2\pi)^{n/2} (\det R_n)^{1/2}} \times \exp\Big(-\frac{1}{2} \sum_{i,j=0}^{n-1} (R_n^{-1})_{ij} x_i x_j\Big)$$

where

$$(R_n)_{ij} = r(i - j) = \int x_i x_j \, d\mu(x), \quad i, j = 0, \dots, n-1.$$

So, if $F(x) = x_0$ for $x \in \Omega$, then, for $i \ge j$,

$$(R_n)_{ij} = \int_\Omega F \circ \sigma^{i-j} \, F \, d\mu.$$

Lemma 6.3:

Let $P_n$, $Q_n$ and $Q$ denote the orthogonal projections onto the linear span of $\{X_1, \dots, X_n\}$, $\{X_{-n+1}, \dots, X_{-1}\}$ and $\{X_{-1}, X_{-2}, \dots\}$ respectively. Then

a)

(55) $$Se(F, \sigma) = \frac{1}{2}\log(2\pi) + \log \|F - QF\|_2 + \frac{1}{2} = \frac{1}{2}\log(2\pi) + \frac{1}{2}\int_{\mathbb{T}} \log f \, d\lambda + \frac{1}{2}, \quad (**)$$

where $f$ is the density of $\nu$ with respect to $\lambda$.

In particular, if $\|F - QF\| = 0$ then $Se(F, \sigma) = -\infty$.

b) If $\|F - QF\| > 0$, the following are equivalent

(i) The Shannon information $\frac{1}{n} l_n(F)$ converges almost everywhere [resp. in $L^1$].

(ii) jfYlf=i^(^ ~ PjF) 2 {a N ~i) converges almost eveywhere [resp. in L 1 ]. 
(Hi) jf^2f=i(F — PjF) 2 {a~^) converges almost eveywhere [resp. in L 1 ]. 

(iv) J2f=i(F ~ QjF) 2 {°~^) converges almost eveywhere [resp. in L 1 ]. 

( v ) ~k ^2n=2 l°9 p n (ax) converges almost everywhere [resp. in L 1 ]. 
Proof: By formula (54) and Lemma 6.1 we get

(56) $$-\log \frac{p_n(x_0, \dots, x_{n-1})}{p_{n-1}(x_1, \dots, x_{n-1})} = \frac{1}{2}\log(2\pi) + \log \|F - P_{n-1}F\|_2 + \frac{1}{2} \frac{(F - P_{n-1}F)^2(x)}{\|F - P_{n-1}F\|_2^2}.$$

Now, since the process is Gaussian, $P_{n-1}F$ converges to $PF$ almost everywhere and in $L^2$, where $P$ is the projection onto the linear span of $\{F \circ \sigma, F \circ \sigma^2, \dots\}$, and $PF = E(F \mid \sigma^{-1}\mathcal{B})$, where $\mathcal{B}$ is the Borel sigma-algebra. So, in the case where $\|F - PF\|_2 > 0$, it follows from (56) that

(57) $$\lim_n -\log \frac{p_n(x)}{p_{n-1}(\sigma x)} = \frac{1}{2}\log(2\pi) + \log \|F - PF\|_2 + \frac{1}{2} \frac{(F - PF)^2(x)}{\|F - PF\|_2^2}.$$

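Then the equality

(58) $$-\frac{1}{N} \sum_{n=2}^{N} \log \frac{p_n(\sigma^{N-n}x)}{p_{n-1}(\sigma^{N-n+1}x)} = -\frac{1}{N}\log p_N(x) + \frac{1}{N}\log p_1(\sigma^{N-1}x)$$

proves that $-\frac{1}{N}\log p_N(x)$ converges almost surely [respectively in $L^1$] if and only if $-\frac{1}{N} \sum_{n=2}^{N} \log \frac{p_n(\sigma^{N-n}x)}{p_{n-1}(\sigma^{N-n+1}x)}$ converges almost surely [respectively in $L^1$]. Now, by (56) and (58), we obtain the equality (**).
In the same way, we get, by Lemma 6.2,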

$$-\log \frac{p_n(x_0, \dots, x_{n-1})}{p_{n-1}(x_0, \dots, x_{n-2})} = \frac{1}{2}\log(2\pi) + \frac{1}{2}\log \frac{\det R_n}{\det R_{n-1}} + \frac{\big(x_{n-1} - \sum_{j=0}^{n-2} b_j x_j\big)^2}{2 \|u_n\|^2}.$$

But, if $L_{n-2}$ is the orthogonal projection onto the linear span of $X_0, \dots, X_{n-2}$, we have $\|u_n\|^2 = \|X_{n-1} - L_{n-2}X_{n-1}\|_2^2 = \|F - Q_nF\|_2^2$, and thus



1 11* 

N ~ j[-logpw(xo, ...,xjv_i) + /o^pi(a;o)] = -log(2ir) + _ ^ log || F - Q n F || 2 



n=2 



+ 1 

2(iV-l)^ 2 ||F-Q n F||| V ; 



Then, in the case where || F — QF ||> 0, the sequence of Shannon informations ^l n (F) converges 
a.e. [ respectively in L 1 ] if and only if Ylj=i(F ~ QjF) 2 ^) does so. 
The other statements can be proved in a similar way.D 



We can see easily from (**) the following

Remark 6.4:

Let $(X_n)$ and $(Y_n)$ be stationary centered Gaussian Markovian processes, with the same $L^2$ norm. Then they have the same Shannon entropy if and only if either they have the same law, or $(Y_n)$ and $((-1)^n X_n)$ have the same law.

Proof: Let $\nu$ be the spectral measure of $X$ and $\nu'$ be the spectral measure of $Y$. Then, if $P_r(t) = \frac{1 - r^2}{1 - 2r\cos t + r^2}$ denotes the Poisson kernel, we have $\nu = P_r(t)\,d\lambda(t)$ for some $r$, and similarly $\nu' = P_{r'}(t)\,d\lambda(t)$ for some $r'$. On the other hand, from (**), the equality $Se(X_0, \sigma) = Se(Y_0, \sigma)$ holds if and only if $\|X_0 - QX_0\| = \|Y_0 - Q'Y_0\|$, where $Q'$ denotes the projection onto the negative coordinates of $Y$. But $QX_0 = aX_{-1}$ and similarly $Q'Y_0 = bY_{-1}$, for some constants $a$, $b$. Thus the equality of the respective Shannon entropies is equivalent to $|a| = |b|$, or to $a\langle X_{-1}, X_0 \rangle = b\langle Y_{-1}, Y_0 \rangle$, that is to $ar = br'$. [Also, one can show by elementary calculus that $\int_{\mathbb{T}} \log P_r(t)\,d\lambda(t) = \int_{\mathbb{T}} \log P_{r'}(t)\,d\lambda(t)$ if and only if $|r| = |r'|$.]
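The bracketed calculus fact can be checked numerically. The sketch below assumes the standard Poisson kernel $P_r(t) = (1 - r^2)/(1 - 2r\cos t + r^2)$ and the classical value $\int_{\mathbb{T}} \log P_r \, d\lambda = \log(1 - r^2)$ for the normalized Lebesgue measure $\lambda$. The function names are illustrative, and `shannon_entropy` simply evaluates the right-hand side of (**) for a unit-variance process with this spectral density.

```python
import math

def poisson_kernel(r, t):
    """Poisson kernel P_r(t), spectral density of a unit-variance Gaussian Markov process."""
    return (1 - r * r) / (1 - 2 * r * math.cos(t) + r * r)

def mean_log_density(r, points=4096):
    """Midpoint-rule value of the integral of log P_r against normalized Lebesgue measure."""
    step = 2 * math.pi / points
    return sum(math.log(poisson_kernel(r, (k + 0.5) * step)) for k in range(points)) / points

def shannon_entropy(r):
    """Right-hand side of (**): (1/2) log(2 pi) + (1/2) * integral of log f + 1/2."""
    return 0.5 * math.log(2 * math.pi) + 0.5 * mean_log_density(r) + 0.5

# The integral depends on r only through |r| ...
assert abs(mean_log_density(0.5) - math.log(1 - 0.25)) < 1e-9
assert abs(shannon_entropy(0.5) - shannon_entropy(-0.5)) < 1e-10
# ... and different values of |r| give different entropies.
assert shannon_entropy(0.5) > shannon_entropy(0.8)
```

Replacing $r$ by $-r$ corresponds to passing from $(X_n)$ to $((-1)^n X_n)$, which is the second alternative in the remark.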



Corollary 6.5:

Let $(X_n)_{n \in \mathbb{Z}}$ be a Gaussian stationary process with spectral measure $\nu$. Let $g = \sum_{n \in \mathbb{Z}} a_n e^{int} \in L^2(\nu)$ and $(Y_n)_{n \in \mathbb{Z}}$ be the stationary Gaussian process such that $Y_0 = \sum_{n \in \mathbb{Z}} a_n X_n$. Then

$$Se(Y_0, \sigma) = Se(X_0, \sigma) + \int \log(|g|)\,d\lambda.$$

Proof: By the spectral theorem, if $\nu'$ is the spectral measure of $(Y_n)$, we have $\nu' = |g|^2 \nu$. So, by the Szegő Theorem,

$$\|Y_0 - Q'Y_0\|^2 = \exp\Big[\int \log\Big(|g|^2 \frac{d\nu}{d\lambda}\Big)\,d\lambda\Big] = \exp\Big[\int \log(|g|^2)\,d\lambda\Big] \, \|X_0 - QX_0\|^2.$$

Thus, by (**), we get the result. $\Box$



Note that, if $g \in L^1(\lambda)$, then $\int_{\mathbb{T}} \log(|g|)\,d\lambda$ is finite if and only if there is $h \in H^1$ such that $|g| = |h|$, and in this case $\int \log(|g|)\,d\lambda \ge \log|h(0)| = \log|\int h\,d\lambda|$.

In particular, if $a_n = 0$ for $n > 0$ (or for $n < 0$) and $g \in L^1(\lambda) \cap L^2(\nu)$, then $\int \log(|g|)\,d\lambda \ge \log|\int g\,d\lambda| = \log|a_0|$.

More particularly, if $g \in H^1(\mathbb{T})$ is an outer function, then $\int \log(|g|)\,d\lambda = \log|\int g\,d\lambda|$, and when, in addition, $|\int g\,d\lambda| = 1$, we obtain $Se(Y_0, \sigma) = Se(X_0, \sigma)$.
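The entropy shift $\int \log|g|\,d\lambda$ of Corollary 6.5 can be evaluated numerically for a simple two-term filter. The sketch below assumes the hypothetical choice $g(t) = 1 - a e^{it}$, so that $\int g\,d\lambda = 1$. For $|a| < 1$ this $g$ is outer and the shift vanishes. For $|a| > 1$ the function has a zero inside the unit disc, is not outer, and by Jensen's formula the shift equals $\log|a| > \log|\int g\,d\lambda| = 0$. The function name `mean_log_modulus` is illustrative.

```python
import cmath
import math

def mean_log_modulus(a, points=4096):
    """Midpoint-rule value of the integral of log|1 - a e^{it}| against normalized Lebesgue measure."""
    step = 2 * math.pi / points
    total = 0.0
    for k in range(points):
        t = (k + 0.5) * step
        total += math.log(abs(1 - a * cmath.exp(1j * t)))
    return total / points

shift_outer = mean_log_modulus(0.5)      # outer case: no change of Shannon entropy
shift_non_outer = mean_log_modulus(2.0)  # zero at z = 1/2 inside the disc

assert abs(shift_outer) < 1e-9
assert abs(shift_non_outer - math.log(2.0)) < 1e-9
```

In the second case the inequality $\int \log|g|\,d\lambda \ge \log|\int g\,d\lambda|$ is strict, which illustrates why the outer hypothesis is needed for equality.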

It is well known from Ornstein theory that a bilateral Gaussian process $X = (X_n)_{n \in \mathbb{Z}}$ with spectral measure absolutely continuous with respect to Lebesgue measure on the circle is isomorphic to the Gaussian independent process. There is interest in considering isomorphism for non-invertible transformations (endomorphisms). The first examples of such isomorphisms were worked out by Parry [13] and elaborated by Hoffman and Rudolph [6] (all the endomorphisms they consider are finite to one). We consider now the unilateral transformation (endomorphism) associated to a Gaussian process with spectral measure equivalent to Lebesgue measure $\lambda$. For clarity, if $X = (X_n)_{n \ge 0}$ is a Gaussian process with spectral measure $\nu = f\,d\lambda$ we consider the endomorphism $T_\nu$ defined on $\mathbb{R}^{\mathbb{N}}$ by $(T_\nu x)_n = x_{n+1}$, for $x \in \mathbb{R}^{\mathbb{N}}$ and $n \ge 0$. Then the shift $T_\nu$ will be isomorphic to the shift $T_\lambda$ if and only if $\log f$ is Lebesgue integrable.

To prove this we recall some useful properties that functions in $H^1$ or in $H^2$ can have. First recall that, for $p = 1, 2$, $H^p$ is the closed subspace of all $f \in L^p(\mathbb{T}, \lambda)$ such that $\int f(t) e^{int}\,dt = 0$, $n = 1, 2, \dots$, and that if $0 \le f \in L^1$ then $\log f$ is integrable if and only if there is $F \in H^2$ such that $f = |F|^2$ [[5], Theorem, p. 53]. Recall also that an inner function $f$ is an analytic function in the unit disc such that $|f(z)| \le 1$ and $|f(e^{i\theta})| = 1$ almost everywhere on the unit circle, and an outer function $F$ is an analytic function in the unit disc of the form

$$F(z) = \alpha \exp\Big[\frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, k(\theta)\,d\theta\Big],$$

where $k$ is a real-valued integrable function on the circle and $\alpha$ is a complex number with modulus 1 [[5], p. 63]. For a function $F \in H^2$ to be an outer function it is necessary and sufficient that the family $\{z^n F : n = 0, 1, \dots\}$ span $H^2$ [[5], Corollary, p. 101]. Also any non zero function $f \in H^1$ can be written in the form $f = gF$ where $g$ is inner and $F$ is outer [[5], Theorem, p. 63, [4], Theorem 12].

In the next theorem the use of Shannon entropy is only to ensure that the logarithm of the density of the spectral measure is integrable.
Theorem 6.6:

Let $\nu$ be a probability measure on the unit circle, equivalent to Lebesgue measure $\lambda$, with density $f$. Then the unilateral shifts $T_\nu$ and $T_\lambda$ are isomorphic if and only if $Se(X_0, T_\nu)$ is finite, or equivalently $\log f$ is Lebesgue integrable.



Proof: Consider the two bilateral Gaussian processes $X'$ and $Y'$ with spectral measures $\lambda$ and $f\,d\lambda$ respectively. Then on the cyclic space $Z_X = \{X_0 \circ T^n : n \in \mathbb{Z}\}$, $T$ is unitarily equivalent to the multiplication $M_z$ by $z$ on $L^2(\lambda)$. An isomorphism $\phi$ is given by $\phi(X_n) = z^n$, $n \in \mathbb{Z}$. Similarly, the same holds for $T$ on the cyclic space $Z_Y = \{Y_0 \circ T^n : n \in \mathbb{Z}\}$ and the multiplication by $z$ on $L^2(f\,d\lambda)$, with isomorphism $\psi$: $\psi(Y_n) = z^n$, $n \in \mathbb{Z}$. It follows that the action of $T$ on $Z_X$ is unitarily equivalent to the action of $T$ on $Z_Y$. Suppose first that $\log f$ is integrable. Then there exists $F \in H^2$ such that $\sqrt{f} = |F|$. Moreover, there exist an inner function $g$ and an outer function $G \in H^2$ such that $F = gG$ and thus $|F| = |G|$. Set $x = \phi^{-1}G$, so that $x$ belongs to the closed linear span of $\{X_0, X_1, \dots\}$, and we have

$$\langle T^n x, x \rangle = \langle T^n \phi^{-1}G, \phi^{-1}G \rangle = \langle \phi^{-1} M_z^n G, \phi^{-1}G \rangle = \langle M_z^n G, G \rangle = \int z^n |G|^2\,d\lambda = \int z^n f\,d\lambda = \langle T^n Y_0, Y_0 \rangle.$$

On the other hand, if $P = \sum_k a_k z^k$ is a polynomial, the following equalities

$$\Big\|X_0 - \sum_k a_k T^k x\Big\| = \Big\|\phi(X_0) - \sum_k a_k \phi(T^k x)\Big\| = \Big\|1 - \sum_k a_k M_z^k \phi(x)\Big\| = \Big\|1 - \sum_k a_k M_z^k G\Big\| = \Big\|1 - \sum_k a_k z^k G\Big\| = \|1 - PG\|,$$

prove that $X_0$ belongs to the closed linear space generated by $\{T^n x : n \ge 0\}$ if and only if 1 belongs to the closed linear space generated by $\{z^n G : n \ge 0\}$. But, since $G$ is outer, this latter space is equal to $H^2$ and thus $X_0 \in \overline{\mathrm{lin}}\{T^n x : n \ge 0\}$. This proves that $T_\nu$ and $T_\lambda$ are isomorphic. The other implication follows from the Szegő Theorem. $\Box$
We end this section with the following result concerning the speed of convergence in linear prediction:

Proposition 6.7:

Let $\lambda$ be the Lebesgue probability measure on $\mathbb{T}$, and $\nu$ the spectral measure of a stationary Gaussian process $(X_n)_{n \in \mathbb{Z}}$. Let $Q$ denote the orthogonal projection onto the closed (in $L^2(\nu)$) linear span of the negative coordinates and $Q_n$ be the orthogonal projection onto the linear span of $\{X_{-n}, \dots, X_{-1}\}$. Suppose that $X_0 \ne QX_0$, or equivalently that $\log\frac{d\nu}{d\lambda}$ is Lebesgue integrable. Then:

The series $\sum_{n=1}^{\infty} \|QX_0 - Q_nX_0\|_2^2$ converges if and only if $\nu$ is absolutely continuous and $\nu = e^{f}\,d\lambda$, with $\sum_{n=1}^{\infty} n |\hat{f}(n)|^2 < \infty$.

An equivalent form of Proposition 6.7 is

Remark 6.8:

Let $\nu$ be a probability measure on $\mathbb{T}$. Let $H$, $H_n$ denote the closed subspaces of $L^2(\nu)$ spanned by $\{e^{ikt} : k \ge 1\}$ and $\{e^{ikt} : 1 \le k \le n\}$ respectively. Let $F$, $F_n$ be the orthogonal projections of the constant function 1 onto $H$ and $H_n$ respectively. Suppose that 1 is not in $H$. Then the following are equivalent

(i) $\sum_{n=1}^{\infty} \|F - F_n\|^2 < \infty$.

(ii) $\nu$ is absolutely continuous with respect to the Lebesgue probability measure $\lambda$ and $\nu = e^{f}\,d\lambda$, with $\sum_{n=1}^{\infty} n |\hat{f}(n)|^2 < \infty$.
7 $\mathbb{Z}^2$ action, pointwise statement

In this section we consider specifically absolutely continuous $\mathbb{Z}^n$ processes, for which we prove pointwise convergence of the Shannon entropy. The case where $n = 1$ has already been considered by Barron [1]. However, his method cannot be extended to the higher dimensional case; the idea of our proof is closely related to the one used by Ornstein and Weiss [10] for the $\mathbb{Z}^n$ version of the Shannon-McMillan-Breiman Theorem. The proof is given for $n = 2$, but it can be easily generalized.

Notations are as in Section 4. Particularly, we refer to (16), (18) for $h_n^{(2)}$ and to Definition 4.11, Lemma 4.10 and Lemma 4.15 for $Se(F,T,S)$. Namely, $f_n^{(2)}$ is the density with respect to Lebesgue measure of the law of $F_n^{(2)} := (F \circ T^i \circ S^j)_{i,j = 0, \dots, n-1}$, and $h_n^{(2)} = -\log f_n^{(2)} \circ F_n^{(2)}$. The aim of this section is to prove the following theorem:
Theorem 4.1:

Let $T$, $S$ be commuting measure preserving transformations on the probability space $(\Omega, \mu)$ with ergodic joint action. Let $F \in L^2(\mu)$ be such that the process $(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2}$ is absolutely continuous, with law $\nu$. Let $\nu_0$ be the law of the process $(F \circ T^m \circ S^n)_{(m,n) \in \mathbb{Z}^2, (m,n) \ne (0,0)}$.

Assume that $Se(F,T,S)$ is finite (which is equivalent to $H_{\mu F^{-1} \times \nu_0}(\nu) < \infty$). Then $\frac{1}{n^2} h_n^{(2)}$ converges almost everywhere and in $L^1(\mu)$ to $Se(F,T,S)$.

In case $Se(F,T,S) = -\infty$ the previous convergence still holds almost everywhere.

Proof: We establish first the invariance of $\liminf_n \frac{1}{n^2} h_n^{(2)}$. Next, with the help of a reduction, we prove that this liminf is in fact almost everywhere a limit.

Recall that, for every n, = (x s j) s ,t=o,...,n is, as in Section 4, formula (ao). 

(a): Let $h^* := \liminf_n \frac{1}{n^2} h_n^{(2)}$. Then $h^*$ is invariant under each action.

Proof of (a): Let ^(o,n)x(o,n) : = X n% 1 , #(i,n)x(o,n-i) : = { x i,j ■■ 1 < i < n,0 < j < n - 1} and 
w n = ^ihffl ' . Then 

W n O S 5 W n+1 =y n + Z n , 

where y n := ^logrv^i ^^"if^ttw and z n := \log(T\™ f(x ,-) x n?-i f{x in ))- 

Now $z_n$ converges to 0 almost everywhere by the pointwise ergodic theorems. For the first one, $y_n$, define, for $\epsilon > 0$, $A_n(\epsilon) = A_n$ by

n n 

A n := {X : f(X^) < J] f(x 0J ) x Y[f(x i>n ) x /(^(i,n)x(o,n-i))}- 

j=0 i=l 

Then, a simple calculation yields p(A n ) < e~ n e and thus for p almost all x there is p such 
that —y n {x) < e for all n > p , and this implies h* < e + h* o S . Hence h* < h* o S. It follows 
that h* = h* o S. 

In the same way, we have also h* = h* o T and this proves (a). 
We prove now that 

(b) // lim„ ^jHn^ is finite then the family {-^zhff ■ n > 1} is p uniformly integrable. 

Proof of (b): Recall that v and p are defined by their respective marginals v n , p n as in (20) and 
(21) respectively, and gfj is as in (32). Also L is as in (02) in subsection 4.1. We prove that the 
family {loggfj : (i,j,n) £ L} is p uniformly integrable, and this will imply, by the equality (E) 
in Remark 4.12, that (^hn^) is p uniformly integrable. For I = (i,j,n), denote gfj by pi, and 
let Ik = (ik, jk,nk), be an infinite sequence in L. We shall prove that (logpi k ) contains a weakly 
convergent subsequence, and this proves (b). By Remark 4.1 fi?), (Ik) contains a strictly increasing 
subsequence which we still denote (Ik)- Let := M^T^. By the formula (34) and Remark 4.13, 
pi k converges v almost everywhere to p^ := ^r L , where p^ and are the restrictions of p and 
v to respectively . Also we have 

> J -logp^dp = -H Uac (p OQ ) > -H v (p) > -00. 

That is logpoo is p integrable. 

But sup fc J pi k logpi k dp < supj J pilogpidp < 00. Hence, by Remark 4.14, {(logpi k ) + : k > 1} is 
uniformly integrable with respect to p. Set Yk = logp\ k and Y = logpoo, so that Yk, Y € ^(p), Y k 
converges p almost everywhere to Y and J Y k dp converges to f Ydp. It follows that Y fc + converges 
p almost everywhere to Y + , and thus, because (Y^~) is p uniformly integrable, the convergence 
holds in L l (p) too. In particular, fY^dp converges to jY + dp. So jY^dp converges to fY~dp, 
and thus, since $Y_k^-$ converges $\mu$ almost everywhere to $Y^-$, it converges in $L^1(\mu)$. This proves that $(Y_k)$ converges in $L^1(\mu)$ and a fortiori it is $\mu$ uniformly integrable.



Now we proceed to prove that $\frac{1}{n^2} h_n^{(2)}$ converges $\mu$ almost everywhere. We begin by showing that

(c) We can reduce ourselves to the case where the density of the law of the first coordinate is greater than one on its support, and also where $\liminf_n \frac{1}{n^2} h_n^{(2)} < 0$.

Proof of (c): Let $F : \mathbb{R}^{\mathbb{Z} \times \mathbb{Z}} \to \mathbb{R}$ be the projection to the $(0,0)$ coordinate, with absolutely continuous law with density $f_0 = \frac{d\mu F^{-1}}{dl}$, and $a > 0$. Let $\phi : \mathbb{R} \to \mathbb{R}$ be the map defined by

$$\phi(x) = \frac{1}{a} \int_{-\infty}^{x} f_0(t)\,dt.$$

Put $G = \phi \circ F$. Then the law of $G$ is absolutely continuous and has density $g_0$ given by $g_0 = a 1_{[0, 1/a]}$.

Let now $\Phi : \mathbb{R}^{\mathbb{Z} \times \mathbb{Z}} \to \mathbb{R}^{\mathbb{Z} \times \mathbb{Z}}$ be the map defined by $(\Phi(x))_{i,j} = \phi(x_{i,j})$, $\forall i, j \in \mathbb{Z}$, so that $\Phi S^m T^n = S^m T^n \Phi$, and let $m := \mu \circ \Phi^{-1}$. Define $\theta$ by

$$\theta(x) = \sup\{t \in \mathbb{R} : \phi(t) = x\}.$$

Then the finite dimensional marginals of $m$ are absolutely continuous with respect to Lebesgue measure and the following relationship holds between the densities $f$ for $\mu$ and $g$ for $m$:

$$g((u_{i,j})_{i,j=0,\dots,n-1}) = f((\theta(u_{i,j}))_{i,j=0,\dots,n-1}) \prod_{i,j=0,\dots,n-1} \theta'(u_{i,j}),$$

with the property: for almost all $t$, $g_0(t) > 0 \Rightarrow g_0(t) \ge a$. It follows that

$$-\frac{1}{n^2}\log g((u_{i,j})_{i,j=0,\dots,n-1}) = -\frac{1}{n^2}\log f((\theta(u_{i,j}))_{i,j=0,\dots,n-1}) - \frac{1}{n^2} \sum_{i,j=0}^{n-1} \log \theta'(u_{i,j}).$$

But it is easy to see that $\log \theta' \circ F \in L^1(m)$ if and only if $\int f_0(t) \log f_0(t)\,dt$ is finite. Also

$$\int \log \theta' \circ F\,dm = \log a - \int f_0(t) \log f_0(t)\,dt.$$

So if $\int f_0(t) \log f_0(t)\,dt$ is finite, the sequence $\frac{1}{n^2} \sum_{i,j=0}^{n-1} \log \theta'(u_{i,j}) = \frac{1}{n^2} \sum_{i,j=0}^{n-1} \log \theta' \circ F \circ S^i T^j(u)$ converges $m$ almost everywhere to $\int \log \theta' \circ F\,dm$. Therefore $-\frac{1}{n^2}\log g((u_{i,j})_{i,j=0,\dots,n-1})$ converges $m$ almost everywhere if and only if $-\frac{1}{n^2}\log f((\theta(u_{i,j}))_{i,j=0,\dots,n-1})$ does so. In this case the corresponding limits (or liminf), $h^*$ and $g^*$, verify

$$g^* = h^* - \log a + \int f_0(t) \log f_0(t)\,dt.$$

So if $h^* \ge 0$, then if we take $a$ such that

$$\log a > h^* + \int f_0(t) \log f_0(t)\,dt,$$

we obtain $g^* < 0$.

Therefore, we can suppose that for almost all $t$, if $f_0(t) > 0$ then $f_0(t) \ge a > 1$, and $h^* := \liminf_n \frac{1}{n^2} h_n^{(2)} < 0$.

This finishes the proof of the announced reduction (c).
As in the Ornstein-Weiss case, we prove that

(d) $\liminf_n \frac{1}{n^2} h_n^{(2)}$ is almost surely a limit.

Proof of (d): Put $\alpha := h^*$. By (c), we can and do suppose that $\alpha < 0$. Let $\epsilon_3 > 0$, and $0 < \epsilon < \epsilon_3$. Let $l_0 \in \mathbb{N}$, $\epsilon_1, \epsilon_2 > 0$ and $\delta_1 > 0$, to be chosen later.

First, by the pointwise ergodic theorem, we find two sequences (ki) and (mi) of natural numbers 
converging to 00 such that (^ L ) converges to 00 as fast as we wish, a set C f2, with /u(f^) > 1— <5i, 
and an integer iV (ei) such that 

VxGO e 1 ,ViV>iVo(e 1 ),3J 7 v(x) C/^i^.!, N 2 {1 - 5 1 - < cardJ N {x), 

and 

VZ,V(i,j) G J JV (x),3n(i,j,Z) G [fc,,m,], e -» 2 («.')(«+ e ) < + X™^},^ _ x ). 

Next, by repeated uses of a Vitali covering type Lemma, we get an upper estimate of the (Lebesgue) 
size of the set £1], which enables us to majorize the measure of the set 
which will ensure the convergence of the series 

N 

in order to get limsup n ^zhffl < h*. The details are as follows: 

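It is easy to see, from the definition of $\alpha$, that, given $\varepsilon_0 > 0$, there exist a set $\Omega_\varepsilon$, with $\mu(\Omega_\varepsilon) > 1-\varepsilon_0$, and two strictly increasing sequences of natural numbers $k_l$, $m_l$, with a ratio of their terms converging to infinity as fast as one wishes, such that

$$\forall l,\ \forall x\in\Omega_\varepsilon,\ \exists n=n(x)\in[k_l,m_l],\quad e^{-n^2(\alpha+\varepsilon)} < f(x) = f(X_{n-1,n-1}).$$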

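Let $J_N(x) := \{(i,j) : 0\le i,j<N;\ T^iS^jx\in\Omega_\varepsilon\}$. Then, by the pointwise ergodic theorem, for any $\delta>0$, there is a measurable set $A_\delta$ with $\mu(A_\delta)>1-\delta$, and a natural number $N_0(\varepsilon_1,\varepsilon,\delta)$, such that for all $N>N_0(\varepsilon_1,\varepsilon,\delta)$ and for all $x\in A_\delta$, it holds

$$N^2(\mu(\Omega_\varepsilon)-\varepsilon_1) < \operatorname{card}J_N(x) < N^2(\mu(\Omega_\varepsilon)+\varepsilon_1).$$

Put $\delta_1 = \varepsilon_0+\delta$, and $\Omega^1 := \Omega_\varepsilon\cap A_\delta$.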

It follows that $\mu(\Omega^1) > 1-\delta_1$, and for $x\in\Omega^1$ and $(i,j)\in J_N(x)$ there exists $n(i,j,l)=n(i,j,l)(x)\in[k_l,m_l]$ such that

(59) $e^{-n^2(i,j,l)(\alpha+\varepsilon)} < f\big((i,j)+X_{n(i,j,l)-1,\,n(i,j,l)-1}\big),$

where we denoted $f(T^iS^jx)$ by $f\big((i,j)+X_{n(i,j,l)-1,\,n(i,j,l)-1}\big)$.
Note that $R_{i,j,l}(x) := (i,j)+X_{n(i,j,l)-1,\,n(i,j,l)-1}$ is the square with first vertex $(i,j)$ and with side having length $n(i,j,l)$. Then, for any $l$, $\{R_{i,j,l}(x) : (i,j)\in J_N(x)\}$ is a finite cover of $J_N(x)$

by squares. So, due to the freedom, mentioned above, in the choice of $k_l$ and $m_l$, by repeated applications of the Vitali covering Lemma [Mattila], for any $\varepsilon_2$ there are $N_1(\varepsilon_2,\varepsilon)$ and $l_1,\dots,l_k > l_0$ such that, for any $N > N_1(\varepsilon_2,\varepsilon)$, there exist subsets $J_{N,l_1}(x),\dots,J_{N,l_k}(x)$ of $J_N(x)$ such that the squares $\{R_{i,j,l_s}(x) : (i,j)\in J_{N,l_s}(x),\ s=1,\dots,k\}$ are disjoint, and there is a subset $J_N^0(x)\subset J_N(x)$ which is covered by $\{R_{i,j,l_s}(x) : (i,j)\in J_{N,l_s}(x),\ s=1,\dots,k\}$ and with $(1-\varepsilon_2)\operatorname{card}J_N(x) < \operatorname{card}J_N^0(x)$.
It follows that, for $N > N_0(\varepsilon_1,\varepsilon,\delta)\vee N_1(\varepsilon_2,\varepsilon)$ and $x\in\Omega^1$, we have

$$N^2u < \operatorname{card}J_N^0(x),$$

where $u := (1-\varepsilon_2)(1-\delta_1-\varepsilon_1)$. Then

(60) $N^2 u \le \sum_{s=1}^{k}\ \sum_{(i,j)\in J_{N,l_s}} n^2(i,j,l_s)(x).$

Thus, since $\alpha<0$, if $\varepsilon>0$ is chosen such that $\alpha+\varepsilon<0$,

(61) $N^2u(\alpha+\varepsilon) \ge \sum_{s=1}^{k}\ \sum_{(i,j)\in J_{N,l_s}} n^2(i,j,l_s)(x)\,(\alpha+\varepsilon).$

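Let $J_N^1(x)$ be the intersection with $\{0,\dots,N-1\}^2$ of the union of the squares $\{R_{i,j,l_s}(x) : (i,j)\in J_{N,l_s}(x),\ s=1,\dots,k\}$. Then if $(i,j)\notin J_N^1(x)$ we have that $1 < \frac{1}{a}f_0(x_{i,j})$. In particular

$$a>1 \Rightarrow 1 < f_0(x_{i,j}),\quad \forall (i,j)\notin J_N^1(x).$$

Since $a > 1$, it follows, by (59), that

$$1 < e^{(\alpha+\varepsilon)\sum_{s,(i,j)} n^2(i,j,l_s)(x)}\prod_{s=1,\dots,k,\ (i,j)\in J_{N,l_s}(x)} f\big(R_{i,j,l_s}(x)\big).$$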






So by (61), we get for $x\in\Omega^1$,

(62) $1 < e^{N^2u(\alpha+\varepsilon)}\Big(\prod_{s=1,\dots,k,\ (i,j)\in J_{N,l_s}(x)} f\big(R_{i,j,l_s}(x)\big)\Big)\times\prod_{(i,j)\notin J_N^1(x)} f_0(x_{i,j}).$
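The passage from (59) and (61) to (62) can be written out step by step. This is a sketch under stated assumptions: $\alpha$ denotes the liminf $h^*$ (negative, with $\alpha+\varepsilon<0$), products and sums indexed by $s,(i,j)$ run over $s=1,\dots,k$ and $(i,j)\in J_{N,l_s}(x)$, the set $J_N^1(x)$ collects the indices of the $N\times N$ square covered by the selected squares, and the reduction (c) is taken to give $f_0(x_{i,j}) > a > 1$.

```latex
\begin{align*}
% (59), rewritten for each selected square:
1 &< e^{n^2(i,j,l_s)(\alpha+\varepsilon)}\, f\big(R_{i,j,l_s}(x)\big),\\
% product over all selected squares:
1 &< e^{(\alpha+\varepsilon)\sum_{s,(i,j)} n^2(i,j,l_s)}
     \prod_{s,(i,j)} f\big(R_{i,j,l_s}(x)\big),\\
% (61) says (\alpha+\varepsilon)\sum n^2 \le N^2 u(\alpha+\varepsilon), so the exponent may be enlarged:
1 &< e^{N^2u(\alpha+\varepsilon)}\prod_{s,(i,j)} f\big(R_{i,j,l_s}(x)\big),\\
% each uncovered index contributes a factor f_0(x_{i,j}) > a > 1:
1 &< e^{N^2u(\alpha+\varepsilon)}\Big(\prod_{s,(i,j)} f\big(R_{i,j,l_s}(x)\big)\Big)
     \prod_{(i,j)\notin J_N^1(x)} f_0(x_{i,j}).
\end{align*}
```

The third line uses only that the product of densities is positive, so replacing the exponent by a larger one preserves the strict inequality.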

But the number of all configurations of such disjoint squares is majorised by $C_{N^2}^{[N^2\beta]}$, which is majorised by $c\,e^{N^2h(\beta,1-\beta)}$, where $\beta$ depends only on $q$, $q$ being the smallest of the $k_l$'s, and $c$ is a constant. Then by (62), the Lebesgue measure $\lambda(\Omega^1)$ of $\Omega^1$ is, for $N$ big enough, majorized by $c\,e^{N^2h(\beta,1-\beta)}\times e^{N^2u(\alpha+\varepsilon)}$.
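The entropy bound on the binomial coefficient is a standard estimate. A minimal derivation is sketched here, assuming $0<\beta\le 1/2$, writing $n=N^2$ and $h(\beta,1-\beta) = -\beta\log\beta-(1-\beta)\log(1-\beta)$ with natural logarithms; the paper's constant $c$ and its exact choice of $\beta$ are not reproduced.

```latex
% Assumption: 0 < \beta \le 1/2, so that \beta/(1-\beta) \le 1.
\begin{align*}
1 = \big(\beta + (1-\beta)\big)^n
  &\ge \sum_{k\le\beta n}\binom{n}{k}\beta^k(1-\beta)^{n-k}
   = (1-\beta)^n\sum_{k\le\beta n}\binom{n}{k}\Big(\frac{\beta}{1-\beta}\Big)^{k}\\
  &\ge (1-\beta)^n\Big(\frac{\beta}{1-\beta}\Big)^{\beta n}\sum_{k\le\beta n}\binom{n}{k}
   = e^{-n\,h(\beta,1-\beta)}\sum_{k\le\beta n}\binom{n}{k},
\end{align*}
% hence
\[
\binom{n}{[\beta n]} \le \sum_{k\le\beta n}\binom{n}{k} \le e^{n\,h(\beta,1-\beta)} .
\]
```

Because $h(\beta,1-\beta)\to 0$ as $\beta\to 0$, the exponential growth rate of the number of configurations can be made as small as needed.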
But easily, 

$$\mu\big(\{f(X_{N-1,N-1}) < e^{-N^2(\alpha+\varepsilon_3)}\}\cap\Omega^1\big) \le e^{-N^2(\alpha+\varepsilon_3)}\,\lambda(\Omega^1).$$
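This inequality is the usual comparison between a measure and Lebesgue measure on a set where the density is small. The sketch below assumes that the law of the variables indexed by $X_{N-1,N-1}$ has density $f$ with respect to Lebesgue measure $\lambda$ on $\mathbb{R}^{N^2}$ and that sets are identified with their images in $\mathbb{R}^{N^2}$; the names $E_N$ and $t_N$ are introduced here for illustration only.

```latex
\[
E_N := \{f(X_{N-1,N-1}) < t_N\}\cap\Omega^1,
\qquad t_N := e^{-N^2(\alpha+\varepsilon_3)},
\]
\[
\mu(E_N) = \int_{E_N} f\,d\lambda \;\le\; t_N\,\lambda(E_N) \;\le\; t_N\,\lambda(\Omega^1).
\]
```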

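So for $N$ big enough, we then obtain $\mu\big(\{f(X_{N-1,N-1}) < e^{-N^2(\alpha+\varepsilon_3)}\}\cap\Omega^1\big) \le c\,e^{-N^2\omega}$, where $\omega = \alpha+\varepsilon_3-h(\beta,1-\beta)-u(\alpha+\varepsilon)$. But there is a constant $\gamma>0$ such that, for $N$ big enough, the exponent $\omega$ is $>\gamma$. In fact, if we put $\nu = \varepsilon_1+\delta_1$, then $u = 1-\nu-\varepsilon_2(1-\nu)$, and thus $\omega = \varepsilon_3-\varepsilon-h(\beta,1-\beta)+(\alpha+\varepsilon)(\nu+\varepsilon_2(1-\nu))$, so that, for $0<\gamma<\frac{\varepsilon_3-\varepsilon}{2}$, we can choose $\varepsilon_1$, $\varepsilon_2$, $\delta_1$, $l_0$ and $N_2 > N_0\vee N_1$ such that, for all $N > N_2$, we have the inequality

$$\varepsilon_3-\varepsilon > 2\gamma > \gamma > h(\beta,1-\beta)-(\alpha+\varepsilon)\big(\nu+\varepsilon_2(1-\nu)\big).$$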
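The algebra behind the lower bound on the exponent can be checked line by line. Here $\alpha$ stands for the liminf $h^*$, $\omega$ for the exponent in the measure estimate, $\nu=\varepsilon_1+\delta_1$ and $u=(1-\varepsilon_2)(1-\nu)$; these are the same quantities as in the text, only spelled out.

```latex
\begin{align*}
\omega &= \alpha+\varepsilon_3-h(\beta,1-\beta)-u(\alpha+\varepsilon),
  \qquad u = 1-\nu-\varepsilon_2(1-\nu),\\
 &= \alpha+\varepsilon_3-h(\beta,1-\beta)-(\alpha+\varepsilon)
    +(\alpha+\varepsilon)\big(\nu+\varepsilon_2(1-\nu)\big)\\
 &= (\varepsilon_3-\varepsilon)
    -\Big[h(\beta,1-\beta)-(\alpha+\varepsilon)\big(\nu+\varepsilon_2(1-\nu)\big)\Big]\\
 &> 2\gamma-\gamma \;=\; \gamma .
\end{align*}
```

The bracket is a sum of two nonnegative terms (recall $\alpha+\varepsilon<0$), and each can be made small: the second by taking $\varepsilon_1$, $\varepsilon_2$, $\delta_1$ small, the first provided $\beta$ is small, which is presumably the role of choosing $l_0$ large.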

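Then the series

$$\sum_N \mu\big(\{f(X_{N-1,N-1}) < e^{-N^2(\alpha+\varepsilon_3)}\}\cap\Omega^1\big)$$

is convergent. Letting $\delta_1\to 0$, we get $\limsup_N w_N(x)\le\alpha+\varepsilon_3$, which finishes the proof in the case $-\infty<\alpha:=\liminf_N w_N<0$.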
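The step from the convergent series to the bound on the limsup is an application of the Borel-Cantelli lemma. The sketch assumes that $w_N = -\frac{1}{N^2}\log f(X_{N-1,N-1})$, which is how the thresholds above are read, and writes $E_N$ (a name introduced here for illustration) for the set $\{f(X_{N-1,N-1}) < e^{-N^2(\alpha+\varepsilon_3)}\}\cap\Omega^1$.

```latex
\begin{align*}
&\sum_N \mu(E_N) \le c\sum_N e^{-N^2\gamma} < \infty
  \quad\Longrightarrow\quad
  \mu\big(\limsup_N E_N\big) = 0 \quad\text{(Borel-Cantelli)},\\
&\text{so for almost every } x\in\Omega^1 \text{ and all large } N:\quad
  f(X_{N-1,N-1}) \ge e^{-N^2(\alpha+\varepsilon_3)},\\
&\text{that is,}\quad
  w_N(x) = -\frac{1}{N^2}\log f(X_{N-1,N-1}) \le \alpha+\varepsilon_3 .
\end{align*}
```

Since $\mu(\Omega^1) > 1-\delta_1$ and $\delta_1$ is arbitrary, the bound holds almost surely; letting $\varepsilon_3\to 0$ then gives $\limsup_N w_N \le \alpha = \liminf_N w_N$.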

If $\liminf w_N = -\infty$, taking $\alpha$ any negative number, the same proof gives $\limsup_N w_N \le \alpha$.
This finishes the proof of Theorem 4.